A CONTROL FACTOR IN SOCIAL ADJUSTMENT ¹


BY DANIEL N. WIENER


Veterans Administration, St. Paul 


INTRODUCTION 


THE phrase “in control of himself” is
often used to designate the person able 
to direct his own activities, to adapt to 
present social demands, to plan for the future. 
“Out of control” is a term often used to 
describe the individual who seems at the 
mercy of immediate environmental stimuli. 
Previous theory and research in this area have 
been extensively and critically reviewed by 
Meehl (2). In his original work on the “gen- 
eral normality” or “control” factor, Meehl dis- 
cusses the difference between two individuals 
with equally deviate personality test scores, 
one of whom appears objectively to be mak- 
ing adequate social adjustment, though “feel- 
ing blue,” and the other whose test-indicated 
depression incapacitates him socially (2). 
Development of the K factor (3) and of 
subtle (S) and obvious (O) keys (6) for the 
Minnesota Multiphasic Personality Inventory 
(MMPI) had a common origin in the attempt
to develop more valid test results for the
MMPI by measuring personality characteristics
which affect test results in a way not
taken into account when the test was originally
constructed.

Originally, the assumption in developing 
the S and O keys was similar to that in the 
development of the K factor in that the 
attempt was made to obtain more valid single 
scores on each scale of the MMPI. The K 
factor attempted to do this by correcting each 
scale of the test for an attitude toward test- 
taking, while the S and O keys attempted to 
do the same thing by determining, for psycho- 
logically sophisticated, intelligent, or test-wise 
individuals, scores on S items whose implica- 
tions could not be detected by the test-taker. 

However, tabulations made of selected
groups indicated that the differences between
the S and O scores themselves had significance
(6). Those individuals whose actual social
adjustment, or potentialities for adjustment,
seemed the least, tended to have O scores
higher than S. On the other hand, the more
successful individuals tended to have S T-scores
equal to or higher than their O scores.
Previous studies have generally confirmed the
above generalizations with small groups of
successful and unsuccessful salesmen (5),
successful and unsuccessful trade school trainees,
high and low intelligence groups, and
well-educated and poorly-educated individuals (6).

¹ Dr. William Hales, with data, and Dr. Paul Meehl,
with viewpoints, have made substantial contributions to
this article. However, the author alone accepts
responsibility for the opinions expressed.

The tentative hypothesis derived from the 
above data may be stated as follows: success- 
ful adjustment in society requires knowledge 
of socially acceptable ways of behavior and 
the desire and ability to act in these ways. 
The socially acceptable way to behave on the 
personality test, as well as more overtly, seems 
to include avoiding deviate behavior. On 
the MMPI, the most deviate items are the 
O items (6), “deviate” because they are seldom
answered in a significant direction by a nor- 
mal population. The socially successful person 
may have the ability to recognize and to 
avoid making scores on personality test items 
which obviously indicate maladjustment, 
while the socially unsuccessful person may be 
unable to recognize or to heed signs of devi- 
ate behavior on a personality test. 

This hypothesis requires the assumption 
of no single “adjusted” or “successful” per- 
sonality test profile. It is recognized that suc- 
cessful adjustment may show itself in many 
different configurations of personality test fac- 
tors. A “control” factor may be postulated 
which affects the various scales of a personality
test in different ways. This concept and
its application may permit more valid prediction
of actual or potential patterns of behavior.


METHOD 


With the very active and helpful coopera- 
tion of Dr. William M. Hales, it has been 
possible partially to test this hypothesis. Most 
veterans discharged from the service with 
neuropsychiatric diagnoses have now had 
several years in which to adjust to civilian 
society. The nature of their adjustment may 
be dichotomized, simply, by saying that one
group is now hospitalized in mental institu- 
tions while the other is not. If two such 
groups can be matched in background, an 
analysis of their differences on a personality 
test may throw some light on the question of 
whether a test “control” factor can be detected 
which may improve the accuracy of prognosis 
of hospitalization. This study is limited to 
individuals who have had schizophrenic epi- 
sodes, because an adequate number of cases 
was available only for this group. 

Two groups of veterans with diagnoses indicating
schizophrenia were obtained. The diagnoses were made
by psychiatrists before the tests were given. Both
groups, serially selected, consisted of white, male
veterans, ages 15-35, to whom the MMPI test was
routinely given. All MMPI scores were corrected for K.
One group consisted of 100 cases in a single mental
hospital, while the other group, 52 cases, was composed
of men not in the hospital at the time of case
selection. Differences in test results that appear
between the two groups are not maximum because overlap
in case selection occurred: that is, certain
hospitalized cases were on the verge of being
discharged, while some non-hospitalized cases had been
and would be hospitalized.


RESULTS ²


Data on the backgrounds of the subjects (Ss) are given
in Table 1. Mean education was practically identical
for the two groups, at approximately 11 grades
completed. While the mean age difference was only one
and one-half years, this difference approached
statistical significance (5 per cent level). However,
it is extremely unlikely that a difference this small
could affect MMPI results.


TABLE 1

DIFFERENCES IN AGE AND EDUCATION OF HOSPITALIZED AND
NON-HOSPITALIZED MALES WITH SCHIZOPHRENIC DIAGNOSIS

100 Hospitalized
51 Non-Hospitalized
(Data missing for one N-H case)


Figure 1 portrays the mean MMPI T-scores, excluding S
and O, for the hospitalized and non-hospitalized
groups. With the exceptions of K, Hy, Mf, and Ma, the
mean T-scores for the hospitalized group are higher
than for the non-hospitalized group. The chief
characteristics of the profile of the hospitalized
group are the high elevations in Pt and Sc, providing
further evidence of the validity of these scales. The
profile for the non-hospitalized groups shows no such
outstanding elevations: the low mean profile, combined
with the fact of non-hospitalization, suggests the
possible invalidity of some of the present psychiatric
diagnoses of schizophrenia. However, the diagnoses of
schizophrenia have been subject in almost all cases to
from two to four psychiatric follow-up examinations.

[Figure: ... AND NON-HOSPITALIZED SCHIZOPHRENICS.
Legend: 100 Hospitalized Cases; 52 Non-Hospitalized Cases.]

Differences between the two groups are significant at
the 1 per cent level only for the scales of Pa, Pt, and
Sc. Using a cutoff point of T-70 on the Sc scale alone,
62 per cent of the hospitalized cases would be
detected, while 38 per cent would be overlooked. For
the non-hospitalized group, 33 per cent had Sc T-scores
of 70 and above.

Results from the S and O keys confirm the
generalization of previous S-O studies. The relatively
successful groups (the non-hospitalized are so
designated in this study) had S T-scores higher than O
scores, while the unsuccessful (in this study, the
hospitalized) groups had O scores much higher than
their S scores. On all five scales for which the S-O
keys have been developed, the non-hospitalized had
higher S T-scores than the hospitalized, the difference
on two scales (D-S, Hy-S) being significant at the 1
per cent level (Figure 2). On the O keys, the
hospitalized had higher scores than the
non-hospitalized, significant at the 1 per cent level
on all scales except for Hy-O.

The sharpest differences between the hospitalized and
non-hospitalized group are on the O keys (Figure 3). In
the original construction of the S and O keys, it was
found impossible to divide the items of the Pt and Sc
scales into relatively subtle and obvious categories
because all of the items seemed too obvious. If these
two scales, Pt and Sc, are therefore considered
together with the obvious keys, the impression is even
stronger that it is getting scores on obviously deviate
items that differentiates relatively “successful” from
relatively “unsuccessful” individuals in this study.
This impression is further strengthened when scores
obtained by a group of successful vocational trainees
are plotted on the graphs (Figures 2 and 3). It is
apparent that these successful vocational trainees are
even lower than the non-hospitalized schizophrenic
group in obvious scores.

² The basic data are not reproduced here, but a table
giving the means and standard deviations of the various
scales for each group and the critical ratios for the
differences between the means can be obtained by
ordering Document from American Documentation
Institute, 1719 N St., N.W., Washington 6, D. C.,
remitting $0.50 for either microfilm (images 1 inch
high on standard 35 mm. motion picture film) or a
photocopy (6 x 8 inches) readable without optical aid.

It further appears that there may be considerable
significance to the amount of gap existing between O
and S T-scores. Whether this relationship is
exclusively one of getting high O scores, or whether it
is one of dynamic relationship between S and O scores,
is a moot question. The “control” explanation which
postulates a dynamic relationship is herein preferred
because the S items apparently do not contribute to the
validity of the total MMPI scale scores in a
non-institutionalized population. That is, there
appears to be a slight tendency in this study, as well
as in previous ones (5, 6), for successful groups
actually to obtain somewhat higher scores on the S
items than do unsuccessful groups.

On the basis of similar previous findings, a simple S-O
index was developed to summarize the differences
between S and O T-scores on the scales of a single
test. A plus 1 score is given to an individual each
time his S T-score is equal to or higher than his O
T-score, while a minus 1 is assigned each time his O
score is 10 or more T-scores above his S score. Thus,
the possible range of scores for an individual MMPI
profile is plus 5 to minus 5, since S and O keys have
been developed for only five of the MMPI scales.

Using this index alone, with a cutoff at zero,
approximately the same number of hospitalized patients
were selected (60 per cent) as by the Sc elevation (62
per cent). However, 6 per cent fewer of the
non-hospitalized group had significant scores on this
S-O index than on the Sc scale.

It might be expected that some of the non-hospitalized
should properly be classified with the hospitalized
groups, since many have been or will be hospitalized.
However, it also appears valuable to differentiate as
greatly as possible between the two groups.

Using both the S-O difference and an Sc score of 70 and
above, 78 per cent of the hospitalized cases were
selected, while only 40 per cent of the
non-hospitalized group were similarly selected. This
combination of Sc with the S-O ratio, then, both
increased the number of hospitalized cases selected and
widened the difference between the hospitalized and
non-hospitalized groups in numbers of cases with
significant signs. It was possible to pick out 16 per
cent of the hospitalized cases by the S-O ratio in
addition to those indicated by a Sc T-score of 70,
while adding only 7 per cent to the number of the
non-hospitalized similarly indicated.

[Figure: ... HOSPITALIZED AND NON-HOSPITALIZED
SCHIZOPHRENICS. Legend: Hospitalized Cases; 52
Non-Hospitalized Cases; Successful Trainees.]

Significant differences between hospitalized and
non-hospitalized groups on the MMPI scales are
indicated in Table 2. Elevations on
the scales of F, Pa, Pt, Sc, D-O, Pd-O, and
Ma-O distinguish the non-hospitalized group.
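The scoring rule for the S-O index described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's scoring procedure: the text says only that five scales carry S and O keys, so the list `SO_SCALES` is illustrative; the names `so_index` and `has_sign` are hypothetical; and "a cutoff at zero" is read here as counting an index below zero as a significant sign, which the text does not spell out.

```python
from typing import Mapping, Tuple

# Assumption: the five scales carrying S and O keys. The text states only
# that there are five; this particular list is illustrative.
SO_SCALES = ("D", "Hy", "Pd", "Pa", "Ma")


def so_index(profile: Mapping[str, Tuple[float, float]]) -> int:
    """Return the S-O index for a profile mapping scale -> (S, O) T-scores."""
    total = 0
    for scale in SO_SCALES:
        s_t, o_t = profile[scale]
        if s_t >= o_t:
            total += 1  # S T-score equal to or higher than O T-score
        elif o_t - s_t >= 10:
            total -= 1  # O at least 10 T-score points above S
        # O above S by fewer than 10 points contributes nothing
    return total


def has_sign(profile: Mapping[str, Tuple[float, float]],
             sc_t: float, sc_cutoff: float = 70.0) -> bool:
    """Combined rule (assumed reading): negative S-O index OR Sc at/above cutoff."""
    return so_index(profile) < 0 or sc_t >= sc_cutoff


# Hypothetical profile, not data from the study.
example = {"D": (55, 50), "Hy": (60, 60), "Pd": (50, 65),
           "Pa": (48, 55), "Ma": (45, 70)}
print(so_index(example))
print(has_sign(example, sc_t=72))
```

The percentages reported in the text are consistent with an either-or rule of this kind: 62 + 16 = 78 per cent of the hospitalized cases and 33 + 7 = 40 per cent of the non-hospitalized cases.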

TABLE 2

MMPI T-SCORE ELEVATIONS DISTINGUISHING BETWEEN HOSPITALIZED
AND NON-HOSPITALIZED SCHIZOPHRENICS AT 1 PER CENT LEVEL *

Hospitalized higher on        Non-Hospitalized higher on

Pd-O, Pa, Ma-O                Hy-S

* O = obvious scores, S = subtle scores


Perhaps more important than these immediate prognostic
advantages, however, is the enriched clinical
interpretation made possible by the use of the S and O
keys in addition to the regular scales of the MMPI.
They permit the introduction, testing, and use of
additional hypotheses to increase the adequacy of
individual diagnosis and treatment.

The hypothesis presented and studied here has been that
recognition and avoidance of behavior which is socially
deviate, marking of test items which subtly indicate
maladjustment, and being “adjusted” or “successful”
tend to go together. Similarly, lack of sensitivity to
or avoidance of unusual behavior, marking of test items
which indicate obviously deviate behavior, and lack of
success in society apparently tend to go together.

A “control” factor may be postulated which may be
indicated by scores on the subtle items of a
personality test. These items may initially
discriminate positively between “abnormal” and “normal”
groups of persons in validity studies. They apparently
tend to discriminate negatively, however, between the
successful and unsuccessful within each group. High
scores on obvious items, on the other hand,
consistently suggest maladjustment or failure.

It is this tendency of subtle items to have negative
discriminating power which suggests a dynamic
relationship justifying use of the term “control.” With
the socially adjusted or successful, apparently the
person tends to check test items subtly symptomatic of
emotional disturbance and to avoid the obvious
symptoms. This is an attempt to describe the data
reduced to simple elements. The attempt has been made
to avoid an explanation in terms of a psychological
entity.


SUMMARY AND CONCLUSIONS

1. While the original validation studies for the MMPI
indicated that both S and O items help to distinguish
groups in mental hospitals from non-hospitalized
groups, subsequent work suggests that S and O items
differ in direction of discrimination with
non-hospitalized groups. S scores tend to be higher for
successful groups, while O scores tend to be higher for
unsuccessful groups.

2. S and O scores on the MMPI were studied for
hospitalized and non-hospitalized men with psychiatric
diagnoses of schizophrenia, of similar education, and
close in age to determine if the non-hospitalized
resembled other relatively “successful” groups and if
the hospitalized resembled other relatively
“unsuccessful” groups.

3. Results indicated that the hospitalized 
and non-hospitalized schizophrenics did dif- 
fer from each other in the same way as “un- 
successful” and “successful” groups in previous 
studies. That is, the hospitalized (“unsuc- 
cessful”) group had higher O scores than the 
non-hospitalized (“successful”), while the 
non-hospitalized tended to have higher S 
scores. 

4. Use of a simple S-O ratio discriminates 
the two schizophrenic groups from the gen- 
eral norm population as well as does the Sc 
scale, and also discriminates as well between 
the two groups. Using the S-O ratio together
with the Sc scale improves both kinds of
discrimination.

5. Social “adjustment,” here defined by a
hospitalized-non-hospitalized dichotomy of 
schizophrenics (and defined in previous stud- 
ies as vocational success-failure and educa- 
tional success-failure), is suggested by S scores 
equal to or higher than O scores, while “mal- 
adjustment” is suggested by O scores higher 
than S. 
6. Since the original data on the development
of the S-O keys indicated that they are
fairly independent aspects of MMPI scales, 
and further studies have indicated that S and 
O keys have practical significance in them- 
selves, their use in a wide range of psycho- 
logical testing situations should appreciably 
enrich the clinical interpretation possible from 
the MMPI. 

7. The hypothesis of a dynamic control 
factor tending to suppress indications of ob- 
vious symptoms is a possible explanation of 
the results. The danger of seeming to postu- 
late a psychological entity is recognized, 
however. Meehl’s critical attitude toward a
“control” factor (2), and his suggestion of a simple
continuum of subtlety to obviousness in
items, provide another hypothesis. Further
study is required.


(A limited number of S and O key item 
lists are available upon request to the author.) 
INFLUENCE THROUGH SOCIAL COMMUNICATION ¹


BY KURT W. BACK

Bureau of the Census


INTRODUCTION 


THE experiment described in this paper
investigates a property of groups which
has been given various names but will
be called, here, “cohesiveness,” following the 
use in a study by Festinger, Schachter, and 
Back (1). Cohesiveness was defined by them 
as the resultant forces which are acting on the 
members to stay in a group; in other words, 
cohesiveness is the attraction of membership 
in a group for its members. In the study 
cited, it was found that under certain con- 
ditions there will be increased pressure 
toward uniformity within a group with in- 
crease in cohesiveness. In the present experi- 
ment, a laboratory situation was created in 
which the consequences of this relationship 
could be studied in detail. 

Because of the artificiality of the situation 
and a narrow range of subjects (mainly 
students in a psychology course), the results 
cannot be freely generalized. But it is prob- 
able that the relationship holds in various 
other circumstances. It has been noted fre- 
quently that members of cohesive groups 
hold uniform opinions and act in conformity 
to group pressure. The conditions under 
which increase in cohesiveness leads to in- 
crease in pressure toward uniformity are not 
precisely known. In the present case, the 
conditions were achieved by producing a 
situation which conformed to the principles 
given below, as rationale for the method used. 

¹ This paper presents the principal part of a thesis
submitted in partial fulfillment of the requirements for
the degree of Doctor of Philosophy at the Massachusetts
Institute of Technology. The author thanks Dr. Leon
Festinger for much valuable advice and aid. The research
was conducted under contract with the Office of
Naval Research (N6-onr-23212-NR 151-698). It is
part of a program of research on social communication
and social influence under the general supervision of
Dr. Leon Festinger.

From the relationship between the forces
to remain in the group and pressure to agree
on important topics, some other relationships
can be deduced.

1. The increase in pressure toward uniformity
should show itself in a discussion
between members. Either members will
attempt to influence each other more in
highly cohesive groups, or they will be more
receptive to influence.

2. The basis for participation in a discus- 
sion of group members lies partly in individ- 
ual motives, which may vary between indi- 
viduals, and pressure arising from the group, 
which affects all members. Since the factor 
which is common to all members is larger in 
highly cohesive groups than in less cohesive 
groups, we would expect less individual dif- 
ferences in participation in these groups. 

3. As the pressure toward uniformity in 
highly cohesive groups is stronger, the activi- 
ties of these groups—discussion, for ex- 
ample—should have a greater effect on the 
members than activities of less cohesive 
groups. 

4. Weak pressures toward uniformity in
less cohesive groups can therefore lead only
to small changes in individual members.
Hence, the preferred outcome should be a
compromise solution where all members
change their positions slightly and equally.
In highly cohesive groups, individual members
may change considerably. Agreement
can be established at any point with little
consideration given the degree to which some
individuals may have to change.

Individuals may want to belong to a group 
because they like the other members, because 
being a member of a group may be attractive 
in itself (for example, it may be an honor to 
belong to it), or because the group may 
mediate goals which are important for the 
members. All these are bases for attractive- 
ness of a group. Increase in any of these 
motives for the members of a group should 
lead to an increase in cohesiveness and 
should, therefore, lead to the same conse- 
quences in terms of the hypotheses stated 
above, although the specific ways of reaching 
them would presumably be different. In 
the experiment, therefore, groups were estab- 
lished on all three bases: personal attraction, 
task direction, and group prestige. The 
strength of cohesiveness for each basis was 
varied. 
The main purpose of the experiment was to
measure the effect of strength of cohesiveness 
on the pressure toward uniformity within 
groups and the consequences of this effect. 
At the same time the effect of different bases 
of cohesiveness could be studied. 


THE METHOD OF THE EXPERIMENT


Introduction 

The experiment attempted to produce a 
situation in which the relationship of attrac- 
tiveness of the group to influence could be 
studied, by fulfilling the following conditions: 

1. The topic on which influence was to be 
exerted should be equally new to all subjects 
in order to minimize differences in famili- 
arity with the problem; also, it should be far 
enough from the subjects’ usual topics of 
discussion that an attitude should not be 
anchored to membership in a group other 
than the experimental one. 

2. The content should be simple enough to 
be discussed in a relatively short time and to 
permit the short discussion to effect a meas- 
urable change. 

3. The attitude of the subjects should differ 
before any influence is exerted. This should 
be guaranteed by the experimental procedure, 
but the subjects should be unable to ascribe 
it to manipulation by the experimenter. 

4. Influence should not be required by the 
experimental situation. In this way the effect 
of attractiveness of the group could be better 
isolated than in a situation where an effect in 
the same direction is created by explicit 
instructions or implied requiredness. For 
instance, success of the group should not 
depend on agreement between group mem- 
bers, nor should the task of the group be 
facilitated if agreement is achieved. 

5. As far as possible, influence should be 
traced to specific attempts to influence. As a 
first approximation to this ideal, it should be 
possible to relate it to the total communica- 
tion of one specific group member. 

6. The proceeding of the experiment 
should be the only experience of the group. 
That is, the subjects should have little or no 
acquaintance with each other before the ex- 
perimental session and should not expect 
continuation of the group afterwards. 

In order to meet these conditions, the 
experiment included the following features: 


1. The topic of discussion was the inter- 
pretation of a set of pictures. This was an 
unusual task on which hardly any group 
standards could have been established outside 
the experimental situation. Photographs 
were taken especially for the experiment and 
were equally unfamiliar to all subjects. 

2. The pictures depicted a simple situation 
which could be discussed in a few minutes. 
They were so unclear that a change in inter- 
pretation was easily possible. 

3. Each subject received a set of three pic- 
tures, believing that all sets were identical. 
Actually, there were slight differences be- 
tween the sets which led to different interpre- 
tations. The differences were too slight to be 
detected in a discussion without seeing the 
photographs again. Furthermore, the fact 
that the subjects were forced to talk about a 
sequence of these pictures made it difficult to 
identify exact features by discussion. This 
device was successful, and subjects never 
realized that there were differences (see 
Figure 1 for the two sets). 

4. The experiment was introduced as a
cooperative working situation; the eventual 
outcome, however, consisted of the independ- 
ent products of each subject. The discussion 
was introduced as an opportunity to improve 
their own stories. Necessity for influence was 
specifically denied, and both length and man- 
ner of the discussion were left to the subjects. 

5. In order to trace influence to one person 
only, the experimental groups consisted of 
pairs. 

6. Although most of the subjects attended 
the same class (of about two hundred and 
fifty students), each member of a pair 
attended a different discussion section of this 
class. After the session, each subject was 
asked whether he had known his partner 
previously, and if he did, the results of this 
group were discarded. As the subjects were 
recruited for a single session of one experi- 
ment, they did not expect any prolonged 
existence of the group. 

General Procedure 

Essentially, the same general procedure 
was followed in all experimental conditions. 
The subjects in the experiment were students 
of two large psychology classes at the Uni- 
versity of Michigan. Each experimental pair 
consisted of students of the same sex. 
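The pairing constraints stated above (same sex, different discussion sections, results discarded when partners already knew each other) can be summarized as a small check. This is a minimal sketch for illustration only; the `Subject` record and the function names are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    # Hypothetical record; field names are illustrative.
    name: str
    sex: str
    section: str  # discussion section of the class


def can_pair(a: Subject, b: Subject) -> bool:
    """Partners must be of the same sex and from different discussion sections."""
    return a.sex == b.sex and a.section != b.section


def keep_results(a: Subject, b: Subject, knew_each_other: bool) -> bool:
    """Results are kept only for valid pairs who reported no prior acquaintance."""
    return can_pair(a, b) and not knew_each_other


first = Subject("A", "M", "section 1")
second = Subject("B", "M", "section 2")
print(can_pair(first, second))
print(keep_results(first, second, knew_each_other=True))
```

Prior acquaintance was asked about only after the session, so in this sketch it is an argument to `keep_results` rather than part of the pairing test.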
Set A          Set B

FIG. 1. THE PHOTOGRAPHS USED IN THE EXPERIMENT


After the subjects were introduced to each
other, each of them was taken to a different
room and given the following instructions:

Your task is to write a story from a set of three
photographs which depict quite a commonplace incident. This
gives you an opportunity to give play to your imagination,
although the story should be plausible and supported by
features of the pictures. The pictures, being taken from a
film strip, form a sequence which you will have to
reconstruct. Then you will write a story connecting the
pictures. Right now you will write a preliminary story.
Then you will talk over your ideas with your partner and
afterwards you will write a final story. Remember, you
should write a good story, but it is important to make it
plausible by the use of the available clues.
In addition, they were given the special
instructions appropriate to their experimental 
conditions, which will be explained later. 
Then they received the pictures and wrote 
the preliminary story. There was no time 
limit. 

When they were finished they came together
to discuss their stories. At the start
of the discussion, the subjects were reminded
that its object was to help them to improve
their own stories. They were cautioned that
it was not necessary to conclude with a common
story and that they could stop the discussion
at any time that they saw its usefulness
at an end. The amount and manner of
communication was therefore left to the
subjects.

After the discussion, the subjects returned
to their separate rooms to write their final
stories. They were instructed: “Write what
you now think to be the best story.” They
could not see the pictures again; therefore,
they could not check information which they
had received from their partners.

After the completion of the experiment, the
subjects were told the significant features of
the set-up, and all their questions were
answered truthfully. In conclusion, they
were asked not to discuss the experiment and
thanked for their cooperation.


Introduction of the Experimental Variables 


The experiment was designed to differentiate
the pairs by the attractiveness of the
group and on the basis of this attractiveness.
Three sources of attractiveness were introduced:
(1) attraction to the partner, (2)
mediation of other goals (task direction), and 
(3) prestige of the group itself. Each of these 
variables was introduced in two different 
strengths. The combination of strength and 
type gave six different experimental treat- 
ments. A seventh treatment was introduced 
in which any force toward the group was 
kept at a minimum. The execution of this 
design required a technique which started at 
the time the subjects were recruited. 
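The design just described (three bases of attractiveness, each at two strengths, plus one treatment with minimal force toward the group) can be enumerated as a quick check on the count of seven. The sketch below is illustrative only; the labels, including the name given to the seventh treatment, are assumptions and not terms fixed by the text.

```python
from itertools import product

# Labels follow the text's three bases; the strength labels are illustrative.
BASES = ("personal attraction", "task direction", "group prestige")
STRENGTHS = ("low", "high")


def treatments():
    """Return the six basis-by-strength cells plus the minimal-force treatment."""
    cells = list(product(BASES, STRENGTHS))
    # Hypothetical label for the seventh treatment, in which any force
    # toward the group was kept at a minimum.
    cells.append(("none", "minimal"))
    return cells


for basis, strength in treatments():
    print(f"{basis}: {strength}")
```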

When the subjects signed up in their 
classes, they were told only that they were 
going to participate in a group experiment. 
The sign-up blank included a few questions
which were ostensibly going to help in making
up the groups. Some questions asked for
self-description and self-ratings. A few
pseudo-projective questions were included.
By means of these, the experimenter could 
pretend that he was able to make some 
shrewd inferences about personality traits. 
The concluding questions read: “You will 
be paired with another student of your own 
sex. As we want people together who are 
congenial, can you describe the type of per- 
son you want to work with?” and “What 
would be the most objectionable traits in a 
person you would work with?” 

Personal Attraction. The questionnaire 
aided in controlling the personal attraction 
the subjects had for each other when they 
entered the discussion. In the treatments 
where attraction was to be the basis of 
cohesiveness, the experimenter referred to the 
questionnaire after giving the instructions 
and reported on the effectiveness of the 
matching. 

To create weak cohesiveness, he said, “You remember
the questions you answered when you signed up in class?
We tried to find a partner with whom you could work
best. Of course, we couldn't find anybody who would
fit the description exactly, but we found a fellow who
responded to the main points, and you probably will
like him. You should get along all right.”

To create strong cohesiveness, he said, “You remember
the questions you answered in class about the people you
would like to work with? Of course, we usually cannot
match people the way they want, but for you, we have
found almost exactly the person you described. As a
matter of fact, the matching was as close as we had
expected to happen once or twice in the study, if at all.
You'll like him a lot. What's even more, he described
a person very much like you. It’s quite a lucky coincidence
to find two people who are so congenial, and you
should get along extremely well.”


Task Direction. In the treatments where 
the group was to mediate goals, the outcome 
of the task was stressed. The experiment 
was introduced as a test; the importance of 
its result for the subject was varied to create 
different degrees of cohesiveness. The ques- 
tionnaire was mentioned in passing as an 
unsuccessful attempt to match partners. 


For low cohesiveness: “This is a part of a study of
the way people use their imaginations. We developed a
somewhat special procedure to test this ability.” After
the instructions for the task were given, the experimenter
continued, “In this way, you will have the best
chance to show your ability and get a high score in the
test. You know, we had some idea of putting people
together who were congenial. But that didn’t work
because of schedule difficulties; so all we could do was
to take into account the objections you stated.”


For high cohesiveness, the same introduction to the
task was given. After the instructions, the experimenter
continued, “Remember, the whole test shows how well
you can use your imagination: your product will be
judged in comparison with that of other people. We
intend, for instance, to compare students from this and 
other universities, and men and women. The group you 
are in is a special prize group. There are ten such 
groups, and the two members who produce the best 
story get five dollars each. You know, we had some 
idea of putting people together who were congenial, but 
that didn’t work out because of schedule difficulties. All 
we could do was to take into account the objections you 
stated.”


Group Prestige. Another way in which 
cohesiveness was produced was by stressing 
the value of belonging to the group. This 
was done by making selection for this par- 
ticular group an important achievement. 
The rarity of this achievement was varied to
create different strengths of cohesiveness.
Here, too, the idea of being matched by
personality was played down.


For low cohesiveness, the experiment was introduced: 
“This is part of a study in the use of imagination. We 
are trying to compare good groups and bad groups in 
this type of work, and your lab section instructor told 
us you would be particularly good material for a good 
group. You know, we had some idea of putting people 
together who were congenial, but that didn’t work out 
because of schedule difficulties. All we could do was to 
take into account the objections you stated.” Then the 
instructions were given. 


For high cohesiveness, the experimenter stated: “This 
is part of a study in the use of imagination. We select 
at first the pairs of people to work together by means of 
the questionnaire you filled out in class (although the 
part about putting congenial people together didn’t work 
out because of schedule difficulties; all we could do was 
to take into account the objections you stated). We try 
to put people together who should be especially good at 
this kind of task. We checked on assignments with your 
lab instructor. From all we could learn, you have all 
the qualifications which have been set up to be good in 
this task: you two should be about the best group we 
have had. So we want to use you as a model group after 
which we can train other people to be more productive 
in this task.” Then the instructions were given. 


Negative Treatment. To minimize all 
forces to belong to the group, the attraction 
to the partner, the outcome of task, and the 
pleasure of the discussion itself were put in 
a dim light. 


After the instructions were given, the experimenter 
said, “I am sorry, but the idea of putting people together 
who are congenial didn’t work. Especially in your case 
we had some trouble because of scheduling. So the 
fellow you are going to work with may irritate you a 
little, but I hope it will work out all right. The trouble 
is that the whole thing is quite frustrating and the
conversation somewhat strained, so we would have preferred
to have you with a person you liked. But, anyway, do 
the best you can.” 


In addition to the talk by the experimenter,
some treatments were stressed by the headings
of the paper on which the subjects wrote
their stories, for instance, “prize
group” for task-directed and “model group”
for prestige high-cohesive groups.

Ten groups were used in each treatment. 
Both members of each pair were of the same 
sex. In each treatment, seven pairs were male 
and three female. Assignment of a pair to 
a treatment was a matter of chance, inde- 
pendent of the answers to the questionnaire. 
These were too vague and general to be of 
any use for this purpose, even if that had 
seemed desirable. One exception in discard- 
ing the questionnaire results had to be made: 
subjects were assigned to a condition where 
personal attraction was important, only if 
they had made a reasonable amount of speci- 
fication about their partners. 


Measurement 


The Measurement of Influence. Influence 
could be measured by the change from the 
preliminary story to the final story. In order 
to arrive at a numerical measure of the 
change, the stories were broken down into 
small units, and the amount of change could 
be measured by the change of these elements. 

The stories were coded in 12 categories. 
These could be grouped under five major 
headings: (1) setting, (2) relationship of the 
two people involved, (3) order of the pic- 
tures, (4) plot of the stories, and (5) general 
characteristics of the stories. Some cate- 
gories, like location or order of the pictures, 
could have only one entry; some, like literary 
level of the story and emotional relationship 
of the two men pictured, had subdivisions 
which actually were different points on a 
continuum; others, like special features in the 
setting and incidents in the story, were like 
checklists, which could be multiple-coded. 
The last type of categories made the code 
flexible enough to accommodate the con- 
siderable range in elaboration of the stories, 
while the scale-type of categories helped in 
determining the direction of the change. 

The changes were determined only by 
comparison of the codes without going back 
to the original stories. Any difference in the
coded stories, whether an omission or an addition,
was considered a change. These changes could then be
separated into those toward the partner’s
position and independent changes.

Changes toward the partner were considered
those which tended toward the position
the partner had shown in either his first or
final story. In the checklist-type categories,
certain items could be grouped into related
clusters; a change toward a category where
the partner’s story contained one in the same
group was considered a change toward the
partner. In the other types of categories, the
classification was self-evident. All changes
which did not meet these criteria were considered
to be independent changes.
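
The classification rule can be illustrated with a minimal sketch. It assumes each coded story is a mapping from category name to a set of entries; the function name, the category and entry names, and the cluster table are hypothetical and are not taken from the study's code book. As a simplification, a replaced entry is counted as one omission plus one addition, and an omission counts as a change toward the partner when the partner's stories never contained that entry.

```python
def classify_changes(first, final, partner_first, partner_final, clusters=None):
    """Return (changes toward partner, independent changes).

    Each story is a dict: category name -> set of coded entries.
    `clusters` optionally maps an entry to a cluster label, so that related
    checklist items count as the same position (an assumption of this sketch).
    """
    clusters = clusters or {}

    def label(entry):
        return clusters.get(entry, entry)

    toward = independent = 0
    for category in set(first) | set(final):
        before = first.get(category, set())
        after = final.get(category, set())
        partner_entries = (partner_first.get(category, set())
                           | partner_final.get(category, set()))
        partner_labels = {label(e) for e in partner_entries}
        for entry in after - before:          # additions
            if label(entry) in partner_labels:
                toward += 1
            else:
                independent += 1
        for entry in before - after:          # omissions
            if label(entry) not in partner_labels:
                toward += 1                   # dropped something the partner never had
            else:
                independent += 1
    return toward, independent
```

Working only from the coded dictionaries, as here, mirrors the statement that changes were determined by comparison of the codes without going back to the original stories.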

The stories were coded by two people separately,
and the differences were then reconciled
into a final code. As a reliability check,
two groups in each treatment were coded
independently. The change score from this
coding correlated with the final code +.69
for the changes toward the partner, and
+.65 for the independent changes (N=28).

The Recording of the Communication.
The communication process itself was recorded
by two observers, who afterwards
rated the total discussion.

One observer noted all the communication. 
His observation blank contained 20 categories,
which fell into three groups:

One group contained all the methods
which could be used to influence the partner.
Examples are stating one’s own position, 
reasoning, emotional arguing, or repeating 
the same argument. 

The second group contained the reactions
to attempted influence. There were five such
categories, arranged along an acceptance-rejection
dimension: accepting the partner’s
story, doubting his own story, stating that
there was a difference between stories,
counter-arguing, and categorical rejection.
They were given arbitrary weights from 1
to 5; from them a mean level of reaction
could be computed.
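
A minimal sketch of the mean reaction level follows. It assumes the weights run from 1 for the most accepting category to 5 for categorical rejection, which is consistent with the later description of the most “accepting” reactions as “minimum” reactions; the category labels and the function name are illustrative only.

```python
# Weights along the acceptance-rejection dimension (1 = most accepting).
REACTION_WEIGHTS = {
    "accepts partner's story": 1,
    "doubts own story": 2,
    "states difference between stories": 3,
    "counter-argues": 4,
    "categorical rejection": 5,
}


def mean_reaction_level(tallies):
    """Mean weight over all tallied reactions; `tallies` maps category -> count."""
    total = sum(tallies.values())
    if total == 0:
        return None  # no reactions were recorded
    weighted = sum(REACTION_WEIGHTS[category] * count
                   for category, count in tallies.items())
    return weighted / total
```

For example, a discussion with two acceptances and two counter-arguments yields (2·1 + 2·4)/4 = 2.5.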

The categories which were not concerned 
with influence attempts made up the last 
group. These were as diverse as aggressing 
against the experiment, bringing up new 
ideas, asking questions, and asking to be 
influenced. 

The units coded were basically total utterances
by one subject, as long as category and
content remained the same. Sometimes this
system would have resulted in units too large,
particularly when the plot was related in
detail. In cases like this, a new tally was
made every 15 seconds.


2 The author thanks Ida Brown, Sue Friedman, Harold
Kelley, Ivan Kelly, and Wayne Pangborn, who acted as
observers during the experiment.


The second observer noted only the 
attempts to influence. From his observa- 
tions the strength of attempted influence 
could be measured. He classified the attempts 
used into 17 categories, such as assertion,
hypothetical example, rhetorical question, and 
exhortation. One sentence was scored as a 
unit. 

Weighting factors were assigned to the 
different categories in the following man- 
ner: each observer (five observers alternated 
in this task) rated the influence attempts 
which he noted on a four-point scale of 
intensity. The dispersion of the ratings of 
the different observers varied greatly. There- 
fore, mean values for each category were 
computed in standard scores for each ob- 
server. These scores for all observers were 
combined, weighted by the number of obser- 
vations on which they were based. If one 
observer deviated considerably from the rest 
on any category, his score was omitted be- 
cause presumably he had difficulty in under- 
standing the meaning of this category. The 
total range of these scores was divided by 
five; the categories whose scores fell in the 
top fifth were assigned a weight of 5, in 
the next fifth a weight of 4, etc. The amount 
of influence attempted by one person during
a period of time was computed as the number
of units scored, weighted by the factors of the
categories in which they occurred.
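
The weighting procedure can be sketched as follows, under stated assumptions: ratings are standardized within each observer, pooled so that each observer's category mean is weighted by its number of observations, and the range of pooled scores is cut into five equal intervals. The step of omitting a deviating observer is left out, and all function and category names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def category_scores(ratings_by_observer):
    """Pooled standard score per category.

    `ratings_by_observer` maps observer -> list of (category, rating) pairs,
    with ratings on the four-point intensity scale. Summing z-scores over all
    observations equals weighting each observer's category mean by its count.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ratings in ratings_by_observer.values():
        values = [rating for _, rating in ratings]
        mu, sd = mean(values), pstdev(values)
        if sd == 0:
            continue  # this observer's ratings carry no relative information
        for category, rating in ratings:
            sums[category] += (rating - mu) / sd
            counts[category] += 1
    return {category: sums[category] / counts[category] for category in sums}


def category_weights(scores):
    """Divide the total range into fifths; the top fifth gets 5, the bottom 1."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {category: 3 for category in scores}  # degenerate case, arbitrary
    width = (hi - lo) / 5
    return {category: min(5, 1 + int((score - lo) / width))
            for category, score in scores.items()}


def strength_of_influence(units, weights):
    """Sum of category weights over the units scored in a period."""
    return sum(weights[category] for category in units)
```

Standardizing within observers before pooling addresses the point made in the text that the dispersion of ratings varied greatly from one observer to another.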

Reliability of this measure was checked in 
three different groups by having two ob- 
servers rate the same meeting and then com- 
paring the values they obtained for “number 
of observations” and “strength of attempted 
influence” for each partner, minute by min- 
ute. For the number of observations the 
correlations are +.91, +.78, and +.64, and 
for strength of attempted influence, +.87, 
+.68, and +.61. 

After the meeting both observers attempted 
to characterize the whole discussion by a 
pattern of discussion. Although they were 
permitted to distinguish five patterns for pur- 
poses of analysis, these were reduced to two 
main types, active patterns and withdrawing 
patterns:

Active patterns implied acceptance of the
discussion situation where the main emphasis 
of the discussion was on discovering the im- 
portant facts in the pictures, on reaching an 
agreement, or on arguing for argument’s 
sake. 

Withdrawing patterns implied little in- 
volvement in the situation. They included 
discussion which consisted mainly of telling 
the stories without additional comments or 
of agreeing that the problem was too 
indefinite. 

A specific type of pattern was assigned to 
a group when both observers checked the 
same one. Agreement was reached in 63 of 
the 70 groups. 

Inasmuch as the observers administered 
the instructions, they always knew which 
type of group they were observing. They 
were, however, mainly unaware of the nature 
of the hypotheses under investigation. There- 
fore, it is unlikely that they would have 
biased the results. The principal measure 
derived from the observation—strength of 
attempted influence—was derived so indi- 
rectly from the actual observations that any 
bias is excluded for the measure. The categories
could only be weighted after all experimental
sessions had been concluded.


Other Measures. Further ratings and
questions were used, in addition to the principal
measures of communication and influence,
to probe the subjects’ feelings during
the experiment. Only a few of these meas- 
ures proved useful; these will be discussed 
in the next section. 

A sociometric scale was designed to meas- 
ure the effects of the experimental situation 
on interpersonal relationships. The scale 
consisted of seven questions which were 
known to correspond to different strengths 
of attraction. The questions were scaled by 
an abbreviated Thurstone technique (2). 
The questions used were selected from a set 
of 40 original questions by a group of judges 
which consisted of 17 students in social psy- 
chology. Each judge divided the statements 
into seven groups according to the desire for 
intimacy expressed. The questions which 
showed the smallest dispersion and for which 
medians corresponded to the seven integers, 
were used in the scale (scale value in paren- 
theses), as follows: 


I would like to see him around campus sometime

I would want to have him in the same lab section

I would enjoy talking to him

I would enjoy an animated discussion with him

I would like to discuss serious general problems with him

I would want him to come to me with his problems

I would discuss important personal problems with him


In the course of the experiment it was 
found that 71 per cent of the 140 subjects 
gave a perfect scale pattern (that is, a “yes” 
answer to any question implied a “yes” to 
any question with a lower scale value). An 
additional 16 per cent gave scale patterns 
which were only one point off. It seemed 
justified to take the total number of “yes” 
answers as the score assigned by the scale. 
This procedure seemed better than using the 
highest scale value of questions answered 
“yes” as the score. In general, as we have 
just seen, the two methods would have the 
same result. But individual peculiarities 
sometimes made subjects answer positively a
question which did not seem to fit their general
patterns. Under the second possible method, these
answers would have made the scores more erratic.
A score based on the number of relationships selected
gave a steadier and, it was assumed, a more
valid score.
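
The two scoring methods compared above can be sketched briefly. The sketch assumes each subject's answers are given as seven booleans ordered from the lowest to the highest scale value; the function names are hypothetical.

```python
def is_perfect_pattern(answers):
    """True if every "yes" implies "yes" on all questions of lower scale value."""
    return list(answers) == sorted(answers, reverse=True)


def score_by_count(answers):
    """Score used in the study: the total number of "yes" answers."""
    return sum(1 for answer in answers if answer)


def score_by_highest(answers):
    """Alternative score: the highest scale position answered "yes" (0 if none)."""
    positions = [i + 1 for i, answer in enumerate(answers) if answer]
    return max(positions) if positions else 0
```

For a perfect pattern the two scores coincide. A single stray “yes” on a high question, as in (yes, yes, no, no, no, yes, no), leaves the count at 3 but raises the highest-value score to 6, which is the erratic behavior the count-based score avoids.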


RESULTS
Strength of Cohesiveness 

Effects on Communication. In a high 
cohesive group, it is our hypothesis that the 
members will try to come to an agreement 
on differences in point of view. Discussion 
on relevant topics, then, should be sought, 
and its importance should be accepted. 

The patterns of discussion provide a first 
test of this hypothesis. It will be recalled 
that the total pattern was rated by the ob- 
servers as to whether it implied active par- 
ticipation or withdrawal from the group 
discussion. In the low cohesive groups, the
withdrawing patterns predominate. Of 26 of these
groups, 19 were rated as having withdrawing 
patterns. In 27 high cohesive groups only 11 
showed withdrawing patterns, while 16 
showed active patterns. (In four low co- 
hesive and three high cohesive groups, no 
agreement between the observers could be 
reached.) This difference is significant at the 
2 per cent level (chi-square test). The
dominant behavior in the active class (arguing,
seeking agreement, and seeking facts)
implies a considerable attempt to influence 
the partner. This over-all measure indicates 
that low cohesive groups react to realization 
of difference by withdrawing from the situ- 
ation, while high cohesive groups tend to 
eliminate the difference. 
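
The chi-square test on the pattern counts can be checked with a short sketch. It uses the counts reported above (low cohesive: 7 active, 19 withdrawing; high cohesive: 16 active, 11 withdrawing) and the standard formula for a 2 x 2 table without continuity correction. Whether the original analysis applied a correction is not stated, so the omission is an assumption; the function name is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
    [[a, b], [c, d]], with its two-sided p-value."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 df, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: low and high cohesive groups; columns: active and withdrawing patterns.
chi2, p_value = chi_square_2x2(7, 19, 16, 11)
```

With these counts the statistic comes out a little above 5.6, with a p-value just under .02, which agrees with the reported 2 per cent level.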

On the more molecular level, the importance
of the discussion for the partners is
indicated by the reaction to the partners’
attempts at influence. An average reaction
level could be computed for the five categories
in which the observation of reaction
was recorded, as the mean of all the values
of these observations.


TABLE 1

LEVEL OF REACTION

                 PERSONAL     TASK        GROUP
                 ATTRACTION   DIRECTION   PRESTIGE   NEGATIVE

Low Cohesive       ?.10         2.22        ?.38
High Cohesive      ?.49         2.85        ?.50

F = V strength / V within groups = 3.91; df = 1 and 54; p < .06


Table 1 shows that the level of reaction
was higher in the more cohesive groups.
These groups tend more toward argument
and serious consideration of the partner’s
position than the less cohesive groups. It
may seem surprising that the more cohesive
groups show more outward signs of resistance,
like objecting to the partner’s story.
From our interpretation of the reaction level,
however, it may be suggested that argument
against the partner is not a real indicator of
resistance; rejecting the group as a reference
group would imply polite agreement as a
means of avoiding entering the discussion at
all. Giving expression to disagreement suggests
a more important role for the discussion
and offers opportunities for later agreement.

This interpretation derives some support
if we consider the extremes of reactions with
different strength of cohesiveness. Taking
the mean of the most “accepting” reactions
occurring in each group, we find no difference
between high and low cohesive groups.
The mean value of these “minimum” reactions
is 1.47 for low cohesive and 1.43 for
high cohesive groups. The difference is pronounced,
however, if the mean of the most
“objecting” reactions in each group is used.
This is 3.83 for high and 3.10 for low cohesive
groups. There is just as much agreement in
both types of groups, but in the high cohesive
groups it is accompanied by serious argument,
while in the low cohesive groups, it
seems to mean mere politeness.

Self-ratings on resistance confirm the interpretation
that the more argumentative level
in the high cohesive groups does not mean
greater resistance to the partner’s arguments.
These ratings show a slight decrease in resistance,
not statistically significant, with all
three bases of cohesiveness.

Observation has shown that more influence
is being exerted in the more cohesive groups.
Conversely, the participants feel that more 
pressure has been exerted on them. In post- 
session interviews the subjects were asked, 
“Did you think that your partner tried to 
influence you?” Less than half (21 of 45) 
of the members of the low cohesive groups 
reported that they felt some pressure, while 
more than two-thirds (36 of 51) of the mem- 
bers of the high cohesive groups did so. The 
remainder did not answer adequately for 
coding. This difference is significant at the 
2 per cent level (chi-square test). The sub- 
jects’ own impressions confirm the validity 
of the observations. 

Acceptance of the discussion group as a 
meaningful reference point means more, 
however, than a stronger effort to come to an 
agreement with the partner. It should be 
manifested also by a greater acceptance of the
partner as a participant in the discussion. In 
general, we can assume that, because of indi- 
vidual differences, one person will be more 
interested than another in convincing his 
partner of the superiority of his story. But 
if pressures from the group are great, they
will affect both members strongly, and the
ultimate effect of individual differences will 
be less pronounced. Further, the partners 
should try to adjust to each other, and give 
each other a greater opportunity to press their 
points. The total effect would be that influ- 
ence attempts are more evenly distributed in 
high cohesive than in low cohesive groups. 

Table 2 confirms this hypothesis: the mean
percentage of attempted influence for the
higher “inducer” is above 60 per cent in all
low cohesive conditions, while it falls to 54-59
per cent in the high cohesive groups. In only
nine of the 30 high cohesive groups does one
partner account for more than 60 per cent of
attempted influence, while this occurred in
more than half of the low cohesive groups.


TABLE 2

PERCENTAGE OF ATTEMPTED INFLUENCE IN HIGHER INDUCER

                 PERSONAL     TASK        GROUP
                 ATTRACTION   DIRECTION   PRESTIGE   NEGATIVE

Low Cohesive       62.4         60.2        64.6
                                                        60.0
High Cohesive      58.9         54.9        56.7

F = V strength / V within cells = 6.98; df = 1 and 54; p < .02


In line with the hypothesis that attempted
influence is a question of personal preference
in low cohesive groups while it is made necessary
by the pressures toward uniformity
in high cohesive groups, we can expect members
of high cohesive groups to accept their
partners’ greater share of influence attempts.
We can test this by comparing the scores on
the sociometric scale given to high and low
inducers in the different treatments. In
Table 3, we see that the lower inducers like
their partners less than the higher inducers
in the low cohesive groups, while no such
differences are shown in the high cohesive
groups. As we have seen above, in the high
cohesive groups the difference between high
and low inducer is smaller too. It is unlikely,
however, that the difference between high
and low cohesive groups can be accounted
for in this way. The difference in attraction
between high and low inducers does not
diminish in proportion to the decrease in
attempts at influence by the higher inducer;
but it vanishes completely. In two of the
three high cohesiveness treatments, the high
inducer shows a slightly lower attraction;
this is actually the reverse of the situation
with low cohesiveness.


TABLE 3

EXTENT TO WHICH HIGH AND LOW INDUCERS LIKE THEIR PARTNERS

                 PERSONAL ATTRACTION   TASK DIRECTION        GROUP PRESTIGE        NEGATIVE
                 HIGH       LOW        HIGH       LOW        HIGH       LOW        HIGH       LOW
                 INDUCER    INDUCER    INDUCER    INDUCER    INDUCER    INDUCER    INDUCER    INDUCER

Low Cohesive     4.?7       3.8?       4.60       3.20       4.55       3.90
High Cohesive    4.25       4.35       3.90       4.15       4.50       4.40

Low cohesive groups: F = V strength of inducers / V within cells = 6.097; df = 1 and 54; p < .03
High cohesive groups: F not significant

Until now, we have limited the discussion 
to the six treatments in which some degree 
of cohesiveness was created. In the negative 
condition, on the other hand, the forces 
toward the group were kept at a minimum. 
Without any forces of this kind, there was 
no pressure toward uniformity within the 
group. But acceptance of the experimental 
situation, interest in the problem itself, and 
a desire to help the experimenter combined 
to make the subjects try to make something 
of the discussion. 

The reaction level of the negative groups
is 2.25, which is close to the average reaction
level (2.33) of the low cohesive groups 
(Table 1). The high inducers in these 
groups account for 60 per cent of the total 
attempted influence just as the percentages 
in the low cohesive groups were 60 per cent 
or more (Table 2). They agree, too, with 
the low cohesive groups in that the high 
inducers were more attracted to partners than 
the low inducers (Table 3). 

The members of negative groups, however, 
attempted more influence than those of the 
low cohesive groups. Six of the 10 discus- 
sions in these groups were rated as having 
“active” patterns. In the same way, nine of 
15 subjects in this treatment reported that the 
partner tried to influence them. This, too, 
is a similar proportion to that of the high 
cohesive groups. 


The foregoing can be interpreted as indicating
that in the negative groups there was
little acceptance of the other member of the
pair as a partner in the discussion; the subjects
do not seem to consider the discussion
as a serious step in establishing an idea about
the stories. But, on the other hand, they are
much freer in expressing their opinions and
pushing their own ideas.

Effects on Influence. We have seen that
increase in cohesiveness raised the pressure
toward uniformity in the communication
process: the subjects tried harder to influence
their partners, and they were somewhat more
willing to accept their partners’ persuasion.
Therefore, they were more influenced by the
discussions.

Table 4 shows the amount of influence
which was shown by both partners. In the
preceding section the measure was described,
which consisted of changes in a code of the
stories. There is a definite increase in change
toward features of the partners’ stories when
cohesiveness increases. That this represents
influence and not increased motivation, and
hence a greater willingness to change and
improve the story, can be seen from a comparison
with changes which were not in the
partner’s direction. These changes (which
cannot be ascribed to the influence of the
partner) increase slightly only in two of three
conditions. The mean of the low cohesive
groups is 5.3, of the high cohesive groups 5.7;
this difference is statistically not significant.
It would seem, therefore, that the change in
Table 4 does represent influence and not a
greater desire to improve the story.


TABLE 4

CHANGES INFLUENCED BY THE PARTNER

Columns: PERSONAL ATTRACTION, TASK DIRECTION, GROUP PRESTIGE, NEGATIVE

Low Cohesive     8.9
High Cohesive   11.0

F = V strength / V within cells; df = 1 and 54; p < .11


The increase in total change within the
group does not give an adequate picture of
the manner in which influence changes with
increase in cohesiveness. We have shown
before how uneven distribution of change
within the group can be taken as a sign of 
strong pressures toward uniformity. In line 
with this, Table 5 shows how much the 
partner who changed more and the partner 
who changed less were influenced in each 
treatment. We can see that almost the total 
increase in influence is the function of one 
member of the pair. As we expected, the 
greater pressure toward uniformity in the 
high cohesive groups results in the possibility 
that some members can be influenced quite 
strongly; as long as the agreement is reached 
at some point, perhaps close to the original 
position of one of the partners, it does not 
matter whether some members will show 
much change and some only a little. In low 
cohesive groups, however, both partners can 
merely show small and approximately equal 
changes. 

The negative groups show an average 
amount of change which is mainly borne by
one member of the group. This result seems
surprising, as it would make the negative 
groups very similar to the high cohesive 
groups. We shall see later, however, that the 
meaning of change is different in these 
groups and that different members of the 
group are primarily affected. 

None of the variables which were measured
in the experiment discriminate between
high and low changers in all types of cohesiveness.
It will be seen later that the
proportion of influence attempts exerted discriminates
between them if cohesiveness is
based on group prestige or if attraction
toward the group is minimized.


TABLE 5

CHANGE TOWARD THE PARTNER: HIGHER CHANGERS AND LOWER CHANGERS

Columns: PERSONAL ATTRACTION, TASK DIRECTION, GROUP PRESTIGE, NEGATIVE

(a) HIGHER CHANGERS

Low Cohesive    5.6   4.8
High Cohesive   7.3   6.1

(b) LOWER CHANGERS

Low Cohesive    3.3
High Cohesive   3.7

High Changers: F = V strength / V within cells = 4.78; df = 1 and 54; p < .05
Low Changers: F not significant


The data presented in this section show
that cohesiveness can indeed be considered
as a unitary concept, although the increase in
cohesiveness corresponded to very different
operations in the various treatments. We
could predict the same effect in each case by
deriving the consequences of increasing the
attraction of the group.


The Basis of Cohesiveness

Cohesiveness has been defined before as the
strength of the resultant force toward membership
in the group. The force may be
directed toward the group under a variety of
circumstances in which the real goal may be
quite different. In the experiment, cohesiveness
was created by using three different bases
of the force: personal attraction between
members, an attractive activity mediated by
group activity, and prestige of belonging
to the group itself. We shall attempt now
to explore further the meaning of these
conditions.

If an individual is attracted to a group
because he wants to be with some of the
members, he will consider the group activity
mainly a means of meeting them. We should
expect therefore that he will try to be pleasant
and active with less regard to the performance
of the group activity as such. On the
other hand, features of the interaction process
may have strong effects on the interpersonal
relationship. It is possible, for instance, that
resistance to influence would affect his reaction
to his partner more in this than in other
conditions.

If an individual enters the group to achieve 
ulterior goals, we can expect him to try to 
perform the required task as efficiently and 
as fast as possible. There should be less effort 
to establish a relationship with the other 
group members except insofar as it is neces- 
sary to perform the work successfully. 

If an individual enters a group because 
membership as such is attractive, we can 
expect that he will be concerned about his 
behavior in order to stay in the favored posi- 
tion. His behavior toward the other group 
members will be determined by his percep- 
tion of them as parts of the environment in 
which he has to succeed. He should adjust 
quickly to their attitude toward him; we 
should expect, therefore, rapid development
of complementary personal roles and a conscious
effort to show good behavior.

Some of the measures obtained in the ex- 
periment support these descriptions of dif- 
ferent sources of cohesiveness. In discussing 
them, we shall compare these different types 
and also relate them to the negative groups. 

The Effects of Personal Attraction. Several
signs in the observations of the personal
attractiveness treatment suggest that the discussion
gave more attention to influences as
such and was more related to interpersonal
relationship than in the other conditions.

From the observations of the discussion we
find some evidence of the increased attention
to the process of influence. One measurement
of this tendency is the number of
groups in which the category “asks to be
influenced” was coded by the observer who
recorded all communications. The category
included statements like, “What do you think
I should put down here?” Statements of this
kind occurred quite rarely. But they were
noted in 10 of the 20 groups in the personal
attraction treatment and in only five of the
other 50 groups. The low cohesive groups
of this treatment account for seven of the 10
groups where this behavior was noticed.
This curious fact is somewhat puzzling. If
it is not due to chance, it may be explained
by the low pressure toward uniformity and
strong interest in the other member which
we find in the low cohesive groups. This
would result in influence and agreement being
sought after, especially in this treatment.

The types of attempted influence, which
the same observer recorded, give evidence in
the same direction. The personal attraction
groups favor the more direct approach while
the most distant method, “stating one’s own
position,” is less represented than in the
other conditions. In these groups, “stating
position” accounts for 10 per cent less of the
attempts at influence than in the other
groups, while methods like “reasoning” and
“seeking agreement” are relatively preferred.

A further suggestion on how personal the
influence process becomes in this treatment is
shown in the analysis of the sociometric
scores. With high personal attraction, the
high changer likes his partner less than his
partner likes him. This difference amounts
to two steps on the sociometric scale. This
difference is significant at the 1 per cent level.
There is no significance in the other treatments.
Only here the group member whose
attempts had not been effective rates his
partner low.


TABLE 6

TIME OF DISCUSSION
(seconds)

Columns: PERSONAL ATTRACTION, TASK DIRECTION, GROUP PRESTIGE, NEGATIVE

t not significant; t = 2.91, p < .01; t = 3.05, p < .01


The effect of the relationship in the personal
attractiveness condition emerges therefore
as increased attention to the influence
process and a personal attitude toward the
effects of the discussion.

The Effects of Task Direction. The rela- 
tionship created by setting up a goal which 
can be reached by the group activity tends 
to have somewhat opposite effects from those 
of the personal attraction relationship. Group 
activity is seen as a necessity which is to be 
completed as quickly and as efficiently as 
possible. 

The intent toward accomplishment is 
shown in the average decrease of 95 seconds 
in the time taken for the discussion when 
cohesiveness increases (Table 6). The de- 
crease is relatively large and occurs only in 
this condition. The communication situation 
does not seem to have any attractiveness by 
itself. This shortening of interaction does
not mean any withdrawal from the situation;
rather, the discussion becomes more
intense. This is indicated by the strength of 
attempted influence per minute. It increases 
correspondingly to the decrease in time be- 
tween the low and high conditions. This 
increase is almost statistically significant (11 
per cent level). There is no comparable 
increase in the other conditions. There, 
attempted influence increases with cohesive- 
ness because the time of discussion increases, 
while in the task direction condition, the 
intensity increases. This suggests that the 
force exerted increases with cohesiveness but 
the way in which this increase is effected 
depends on the basis of cohesiveness. It 
seems that the relationship established here 
makes the group activity a means to an end, 
and the partner is just a tool in the process. 
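
The distinction drawn above, between more attempted influence through a longer discussion and more attempted influence through a higher rate, can be sketched as a simple rate calculation. This is only an illustration: the function name and all figures below are hypothetical and are not taken from the experiment, apart from the 95-second difference in discussion time.

```python
def attempts_per_minute(total_attempt_units: float, seconds: float) -> float:
    """Intensity of attempted influence: attempt units per minute of discussion."""
    if seconds <= 0:
        raise ValueError("discussion time must be positive")
    return total_attempt_units / (seconds / 60.0)


# Hypothetical figures for illustration only (not the study's data):
# the same total amount of attempted influence in a discussion 95 seconds shorter.
low_cohesive_rate = attempts_per_minute(20.0, 500.0)
high_cohesive_rate = attempts_per_minute(20.0, 405.0)

print(round(low_cohesive_rate, 2), round(high_cohesive_rate, 2))
```

With the total held constant, any shortening of the discussion necessarily shows up as a higher rate per minute, which is the pattern described for the task direction condition.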


These data suggest that the mediation of goals by the group activity increases the efficiency of the discussion to the exclusion of interest in the activity and in the partner.

The Effects of Group Prestige. We have suggested before that cohesiveness based on group prestige will have the following implications: Members will be careful of their own behavior, guiding their actions by some general notions of how they are supposed to act. As they focus their attention on their own proper behavior, the partner becomes the background in this situation, though a very important one. They will therefore adjust quickly to their partners’ behavior, and a mutual adjustment of personal roles will result.

We should expect that the feeling of being “on the spot” would result in wariness during the experimental situation. We have seen that in these conditions the discussion tends to be short, an average of 335 seconds (Table 6). Further, relatively little change occurred. Table 7 shows the combined changes, both toward the partner and independently, and we see that the prestige groups clung most to the original story.* This may be interpreted as an avoidance of the discussion situation.

If the complementary relationship between the partners is established here, it should result in an unequal distribution of influence. We have seen that there is a general tendency in this direction in the high cohesive groups. Evidence is given in Table 8 that, in the prestige groups, this differentiation is a function of the amount of attempted influence by the group members. We see that in groups of this kind, especially with high cohesiveness, the low inducer changes more than the higher inducer. This would suggest that, in this treatment, making the larger change corresponds to a submissive role.


TABLE 7

CHANGES OF ALL KINDS

[Legible column headings: Personal Attraction, Group Prestige, Negative; rows: Low Cohesive, High Cohesive. The cell entries did not survive. The test statistic shown is the ratio F = V(types) / V(within cells).]


* That the difference between these bases of cohesiveness derives mainly from this condition is seen from t-tests between the individual means:
Task direction vs. Group prestige: t = 2.53, p < .02
Personal attraction vs. Group prestige: t = 1.73, p < .08
Task direction vs. Personal attraction: t = 0.80, p < .45

It may seem surprising that the small differences in attempted influence (Table 2) should result in such a definite differentiation of role. The more equal distribution of attempted influence in the high cohesive groups was said to result in part from a growing consideration of the partner in the discussion. It would be reasonable to suppose that under the stress of the group prestige situation, a conscious effort was made to let the partner have his say, particularly by the member who felt in control of the situation. If we assume that this effort will be made after the relationship is established, we could expect that the difference in attempted influence in the first part of the
discussion would be quite large but would
vanish during the later part. Analysis of the 
difference in attempted influence in the first 
and second half of the discussions bears out 
this hypothesis. Of all conditions, only the 
high prestige group showed an appreciable 
difference between the first and second half 
of the discussion. The difference between 
the partners dropped from 8.3 “attempt units” 
to virtual equality between the partners (0.9 
units). This is the closest the two partners 
came to equality in any of the treatments. 
The difference between the two parts of the 
discussion in these groups is significant at 
the 5 per cent level. We can conclude that 
a situation such as that of the high prestige 
treatment demanded extreme responsiveness to the partner’s behavior and that a definite relationship was quickly established. But the remainder of the discussion would not correspond to the actual roles but to the subjects’ notions of a good group.

Table 8 shows the negative condition in striking contrast to the high prestige condition. Here the higher inducer changes most. We would conclude that the partner was disregarded here, and the individual tries to discuss possible changes only if he himself is willing to change. No genuine interaction seems to be involved but, rather, two people acting independently and convincing only themselves that they should change.

The high prestige groups seem to exhibit relationships between personal roles and acceptance of influence, a property which we generally associate with functioning groups. In this way, they contrast most with the negative pairs, where cohesiveness was reduced to a minimum, and which therefore should be very different from functioning groups.


TABLE 8

CHANGES OF HIGH AND LOW INDUCERS

[Columns: High Inducer and Low Inducer under each of Personal Attraction, Task Direction, Group Prestige, and Negative. The cell entries and test statistics did not survive legibly.]

Remaining differences not significant.
Note: Two groups with tied scores of attempted influence excluded.


CONCLUSIONS

The results of the experiment show in a concrete fashion the manner in which the hypotheses which were stated in the introduction were confirmed. The pairs of subjects in the experiment formed groups in which pressure toward uniformity was related to the cohesiveness of the group.

Within this setting the results show that an increase in cohesiveness, independent of its nature, will produce the following consequences:


1. In the high cohesive groups the mem- 
bers made more effort to reach an agreement. 
Both the ratings of the total discussion and 
direct observation showed more serious effort 
to enter the discussion in highly cohesive 
groups. The subjects’ own statements also 
confirmed the high pressures in these groups. 

2. Behavior in the highly cohesive groups 
was more affected by the situation than in 
the low cohesive groups. The amount of 
attempted influences measured in highly 
cohesive groups showed less individual dif- 
ferences, and those differences which did 
exist were not considered on a personal level. 

3. In the highly cohesive groups the dis- 
cussion was more effective in that it pro- 
duced influence, that is, group members 
changed more toward the partners’ positions 
than they did in the less cohesive groups. 

4. In the highly cohesive groups the change
was quite unevenly distributed between the
members, while in the less cohesive groups 
the changes were more evenly distributed. 
On the average, one member of the highly 
cohesive groups changed more than either 
member of the less cohesive groups; and the 
other member of the highly cohesive group 
was nearly the same as one member of the 
less cohesive groups. 

The four points summarize the effects of 
the forces to belong to the group, of cohesive- 
ness considered as a unitary concept. The 
differences among the ways in which co- 
hesiveness was produced led to the following 
interpretations about patterns of communica- 
tion and influence: 

1. If cohesiveness was based on personal 
attraction, group members wanted to trans- 
form the discussion into a longish, pleasant 
conversation. The discussion was taken as a 
personal effort, and rejection of persuasion 
tended to be resented. 

2. If cohesiveness was based on the performance of a task, group members wanted to complete the activity quickly and efficiently; they spent just the time necessary for performance of the task, and they tried to use this time for the performance of the task only. They tended to participate in the discussion only as much as they thought it valuable to achieve their purposes.

3. If cohesiveness was based on group 
prestige, group members tried to take as few
risks as possible with their status: they
acted cautiously, concentrated on their own 
actions, and adjusted to their partners as the 
social environment. One partner would 
easily assume a dominant role, and the sub- 
missive member was influenced more, with- 
out their actually trying to establish this 
relationship. 

Finally, with cohesiveness at a minimum, 
the members of the pair acted independently 
and with little consideration for each other. 
As the subjects did not try to adjust to the 
other member of the pair, each member was 
concerned only with his own discussion. 
Influence, accordingly, did not depend on the 
action of the partner but on the interest of 
the member himself in entering the group 
activity.


VALIDITY OF THE CHICAGO ATTITUDE INVENTORY AS A
MEASURE OF PERSONAL ADJUSTMENT IN OLD AGE 


BY ROBERT J. HAVIGHURST ¹


Committee on Human Development, University of Chicago


PERSONAL adjustment is a central problem
in studies of old age. To be of greatest
use in scientific studies, this concept
must be clearly defined, and means of meas- 
uring it must be devised. One method of 
measuring personal adjustment is to secure 
from the subject a set of statements reporting 
his attitudes about himself and his activities 
and then to evaluate these statements. The 
Chicago Attitude Inventory has been devised 
for this purpose, and an initial report has 
been published on its reliability and validity. 
The present study is a further attempt to 
discover the degree of validity of this 
instrument. 

Your Activities and Attitudes² (1) is an
inventory of activities and attitudes for older 
people. It includes the Chicago Attitude 
Inventory, which consists of 56 attitudinal 
statements in eight categories: Health, 
Friends, Work, Economic Security, Religion, 
Feeling of Usefulness, Happiness, and 
Family. 

The Inventory is based upon the following definition and theory of adjustment.³ A person who is well adjusted lives a life that is reasonably satisfactory to himself and meets the expectations of society reasonably well. This crude definition points to the fact that personal adjustment has two essential aspects, subjective and objective. The subjective aspect of adjustment consists of a person’s feelings about himself and his life. The objective aspect consists of the person’s reputation, his status, and his participation in activities and relationships which are judged desirable or undesirable in the society in which he lives.

The Attitude Inventory attempts to measure the person’s feelings about himself and his life. Its results are expressed in a total score and eight sub-scores, which may be taken as indices of the individual’s adjustment.


1 The author wishes to acknowledge the assistance of the following people who did field work and analysis of data in this project: Ruth Albrecht, John Flaherty, Edith Fleming.

2 This schedule is reprinted and preliminary studies of its validity and reliability are reported in Personal adjustment in old age (2).

3 For a fuller account, see Chaps. II, VII, IX, and XI of Cavan, et al. (2).

Validation of this instrument may proceed 
in either of two ways. First, it may be 
checked against the results of a thorough and 
intensive questioning of the subject. The 
purpose of this more intensive questioning 
is to find out whether the respondent really 
feels the way he reports himself to feel— 
whether he is really as happy, content in his 
work, sure of his place in the affections of 
his friends, as he reports himself to be on the 
Inventory. This method was used in the 
present study. The interviewer was a person 
trained in psychology and interviewing, but 
without psychiatric training. He held one 
or more interviews with the respondent in 
addition to having him fill out Your Activities and Attitudes Inventory. Then the interviewer rated the respondent’s adjustment in ways which will be described later. Two other judges who had never seen the respondents read the interviews and the Activities section of the Inventory, and rated the respondents. These judges did not see the Attitudes section of the Inventory.

This test of validity proceeds on the prem- 
ise that if Attitude scores are highly cor- 
related with judges’ ratings—the ratings 
made on the basis of the interview material— 
the Attitude Inventory is a valid measure of 
personal adjustment. 

A second method of testing the validity of 
the Attitude Inventory is to check it against 
information concerning the respondent’s 
reputation, actual health, economic status, 
and social participation. The assumption is 
that good adjustment goes with good health, 
with economic security, with active partici- 
pation in the family, neighborhood, and com- 
munity, and with good general reputation. 

This test of validity proceeds on the premise
that if Attitude scores are highly correlated
with the more objective indices of
adjustment, the Attitude Inventory is a valid
measure of personal adjustment.

This second method was used in two ways: 
(1) the Attitude scores were compared with 
ratings from people in the community who 
knew the respondents, and with ratings from 
the interviewers who observed the respond- 
ents in their homes and often in their daily 
activities; (2) the Attitude scores were com- 
pared with the reports of respondents on their 
status and activities (the Activities section of 
the Inventory). 


PREVIOUS DATA ON VALIDITY OF THE INVENTORY

The previous data on validity of the Attitude Inventory may be summarized as follows. The Attitude scores were compared with the following ratings or scores:

1. For a sample of 100 persons, three members of the research staff read the Activities section of the Inventory and the narrative account of the interviews, and made a rating on the Cavan Adjustment Scale.⁴ Their ratings covered the areas:

Personal, intimate associations with family and friends;

Group associations and contacts with the world through reading and radio;

Individual activities and hobbies;

Feelings of emotional security in relation to friends, family, or religion;

Feelings of status or social recognition;

Happiness and contentment.

2. For a sample of 168 persons, ratings were made by people who knew the respondents. The ratings were made by means of a Check-List of 50 questions asking about activities or easily observable actions and corresponding roughly to the categories of the Attitude Inventory.⁵ The raters were: university students of sociology (47 cases), case workers for Old Age Assistance (90 cases), interviewers on the research staff (13 cases), and miscellaneous (18 cases).

3. For a sample of 102 respondents, the Attitude scores were compared with the Activities scores obtained from the Inventory.

The previous evidence on the validity of the Attitude Inventory is summarized in Table 1, in the form of product-moment correlations between the Attitude scores and the various ratings and scores from the validating procedures.


TABLE 1

CORRELATIONS OF ATTITUDE INVENTORY SCORES WITH SCORES ON VALIDATING INSTRUMENTS

[Columns: Validating Procedure; for the Former Study and for the Present Study, No. of Subjects and r with Total Attitude Score.* Rows:
Average of 3 ratings on Cavan Adjustment Scale by interviewer and research staff;
Activities score from Your Activities and Attitudes;
Check-List scores by mixed group of judges who knew the subjects;
Check-List scores by interviewers (total sample);
Check-List scores by interviewers on subjects they knew best;
Check-List scores by community residents;
Check-List scores by interviewers on sample rated by community residents.
The numerical entries did not survive; the coefficients are reported in the text.]

* Pearson product-moment coefficients.


4 For a full description of the Scale, see Chap. XI, pp. 129-30 of Cavan, et al. (2).

5 For full description of this Check-List and other instruments mentioned, see Chap. XI and Appendices of Cavan, et al. (2).

NEW DATA ON VALIDITY OF THE ATTITUDE INVENTORY


Essentially the same methods of validation 
of the Attitude Inventory have been em- 
ployed in a study of the older persons of a 
typical small midwestern city. Certain fea- 
tures of the new study, however, make it 
possible to obtain a clearer estimate of the 
validity of the instrument. 

1. The respondents consist of a representa- 
tive sample of 98 persons. There was a total 
number of 690 people over 65 who lived in 
the community, about 75 per cent of whom 
were well enough to be interviewed and 
willing to be interviewed. The sample is 
representative of the total group in age, sex, 
social status, and marital status. In the 
earlier study, a disproportionately large num- 
ber of subjects were receiving Old Age 
Assistance. 

2. The judges using the Check-List in the present study were the field worker who interviewed the respondent, and someone in the community who knew the respondent fairly well. The judges in the previous study were, in half the cases, case workers on Old Age Assistance programs and, in most other cases, university students who were relatives or acquaintances of the respondents. It is probable that the field workers in the present study (all of whom had psychological training and none of whom had other than research interests in the subjects) were in a better position to rate subjects objectively than were the students in the previous study. However, the case workers who served as judges in the previous study were probably as skilled in judging human behavior as were the field workers in the present study. The community residents who rated subjects in the present study knew their subjects fairly well; but they were handicapped in their use of the Check-List by ignorance of the subject’s experience in his home and family relationships.

3. The Attitude Inventory used in this study is slightly different from the one used in the earlier study. Two categories used in the earlier study, “organizations” and “leisure,” were omitted, and about 10 new items were introduced in place of items on the old form which had proved not to discriminate between people with high and low scores on the check-list instrument. Thus the new form of the Attitude Inventory is somewhat shorter but also somewhat improved over the form used in the earlier study.

Results of the validity studies are sum- 
marized in Table 1 in the form of product- 
moment correlation coefficients between total 
scores on the Attitude Inventory and scores 
on validating instruments. 

Two of the validating procedures used in 
this study give almost exactly the same re- 
sults as in the earlier study: the Activities 
scores of the inventory and the ratings on the 
Cavan Adjustment Scale.⁶

The ratings on the Check-List, on the other hand, did not show the same correlations with Attitude scores as in the earlier study. In the earlier study, where there was a mixed group of judges using the Check-List, the coefficient of correlation was .50.

In the present study, there are four coefficients of correlation which can be compared with the earlier value of .50.

1. For the total sample of 98 cases, interviewers’ ratings on the Check-List correlated
.64 with Attitude scores. 

2. Interviewers were asked to indicate which of the 98 subjects they knew independently of the interview situation, that is, which of these old people they knew fairly well from observation at church, in social gatherings, on the streets, and from discussion with other people in the community. The two interviewers selected 59
of the 98 subjects in this way. For these 59 
cases, interviewers’ ratings on the Check-List 
correlated .66 with Attitude scores. 

3. Thirty-one cases of the 98 were rated 
both by the interviewers and by community 
residents who knew the respondents. For 
these 31 cases, when the Check-List was used 
by the interviewers the coefficient of correlation with Attitude scores was .68; when the Check-List was used by community residents, the coefficient of correlation was .12.

6 In ratings based upon the Cavan Adjustment Scale, the interviewer acted as one of the judges. While the interviewer had seen the Attitudes section of the Inventory at the time the Inventory was filled out by the respondent, he made his ratings some months later. He was, therefore, presumed to be little influenced by the respondent’s answers on the Attitudes section of the Inventory.

Were the community residents better 
judges of the adjustment of the old people 
they rated than were the interviewers? 
Since there were only 31 cases in this group, 
it was not difficult to analyze the ratings of 
community residents to find the cause of the 
low correlation coefficient. Results of the 
analysis were as follows: 

One of the community residents, who filled out seven check-lists (a larger number than was filled out by any other community resident) rated subjects much lower than did the interviewers. This in itself would not normally reduce the correlation coefficient; but it did so in this case because his seven subjects all had Attitude scores which fell above the mean. Hence the low ratings they received from this man had the effect of reducing the correlation coefficient. When this man’s ratings are omitted, the correlation coefficient becomes .19.

The remaining 24 check-lists were filled out by 11 different people. Two of them were on the staff of the county welfare department, two were ministers, the others were acquaintances of the persons being rated. None of them had had any previous experience in making ratings. Most of them were substantially higher in social status than the persons they rated. None of them was a close friend of the person being rated except that one daughter rated her mother (and gave her the same score that she was given by the field worker). These facts suggest that the community judges may not have been good raters in this situation. On the other hand, they were representative of certain community attitudes and perhaps rated the subjects more accurately in terms of community attitudes than did the interviewers.


TABLE 2

PRODUCT-MOMENT CORRELATIONS BETWEEN SCORES ON THE CATEGORIES OF THE ATTITUDE INVENTORY AND BETWEEN CATEGORY SCORES AND TOTAL SCORES

(Comparison of Present Study with Earlier Study)

[Rows and columns: Health, Friends, Work, Economic Security, Religion, Feeling Useful, Happiness, Family, Total Score. The matrix entries are not legible in this copy.]

* Figures in parentheses are correlation coefficients from the earlier study (2, p. 134), obtained by averaging the coefficients for males and females. Although theoretically unjustified, the averaging could produce no marked error because the greatest difference between correlation coefficients for males and females was .09.

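
The footnote to Table 2 remarks that averaging the male and female coefficients is theoretically unjustified but harmless when the two differ by no more than .09. A minimal sketch of why: the usual alternative, Fisher's z transformation (a standard technique, not one the source reports using), gives almost the same answer for coefficients that close together. The pair of values below is hypothetical, and the averaging is unweighted, which assumes equal group sizes.

```python
import math


def simple_mean_r(r_values):
    """Arithmetic mean of correlation coefficients (the procedure the footnote describes)."""
    return sum(r_values) / len(r_values)


def fisher_mean_r(r_values):
    """Average coefficients on Fisher's z scale, then transform back to r."""
    z_values = [math.atanh(r) for r in r_values]
    return math.tanh(sum(z_values) / len(z_values))


# Hypothetical male and female coefficients differing by .09 (not from the study).
pair = (0.60, 0.69)
gap = abs(simple_mean_r(pair) - fisher_mean_r(pair))
print(round(simple_mean_r(pair), 3), round(fisher_mean_r(pair), 3), round(gap, 4))
```

For a difference this small the two averages agree to within a few thousandths, which is consistent with the footnote's claim that no marked error results.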

It is difficult to form a judgment concern- 
ing the relative validities of the ratings by the 
community judges and the interviewers. 
The writer holds that residents of the com- 
munity, provided they know the subjects well 
and have some training in rating, would be 
more valid raters than field workers who 
lived in the community only briefly and saw 
the subjects only briefly. But investigation 
makes it clear that the community judges in 
this case did not know the subjects very 
well. Moreover, it is possible that the Check-List,
with a number of items about family
relations and other non-public activities, is
not a good instrument to use with community
judges.
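
The effect described earlier, in which one severe rater lowered the coefficient because all of his subjects had Attitude scores above the mean, can be illustrated with a small sketch. The scores below are invented for the purpose and the helper name `pearson_r` is an assumption; only the logic follows the text: a constant shift applied to every rating leaves the product-moment coefficient unchanged, while the same shift applied only to above-mean subjects lowers it sharply.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: ratings that track Attitude scores perfectly.
attitude = [20, 24, 28, 32, 36, 40, 44, 48]
ratings = [10, 12, 14, 16, 18, 20, 22, 24]
mean_attitude = sum(attitude) / len(attitude)

# A severe rater who scores every subject 10 points lower changes nothing.
all_shifted = [r - 10 for r in ratings]

# The same severity applied only to subjects above the Attitude mean.
top_shifted = [r - 10 if a > mean_attitude else r for a, r in zip(attitude, ratings)]

r_all = pearson_r(attitude, all_shifted)
r_top = pearson_r(attitude, top_shifted)
print(round(r_all, 3), round(r_top, 3))
```

The second coefficient falls far below the first even though the rater's ordering of his own subjects is untouched, which is the situation the text describes for the seven check-lists.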

The reason that the earlier correlation co- 
efficient between the Check-List and the 
Attitude scores was as high as .50 is probably 
the following: judges consisted mainly of 
two groups—social workers who rated re- 
cipients of Old Age Assistance who were 
their clients and students who rated elderly 
relatives. All these people were probably 
better trained in observing behavior than 
were the community judges in the present 
study, and the students, at least, probably 
knew the people they were rating better than 
did the community judges in the present
study. 


INTERNAL CHARACTERISTICS OF THE ATTITUDE 
INVENTORY 


The Inventory, when scored in the simple way recommended in the earlier study, gave a range of scores from 13 to 46 for the 98 subjects of the present study. The mean was 34 and the standard deviation was 7.3. Table 2 gives the intercorrelations for each pair of categories, and the correlations between each category and the Total Attitude
score. It is instructive to compare these data
with the analogous data for the earlier form 
of the Inventory, also included in Table 2. 
The principal difference between the two 
sets of coefficients is that in the present study 
all but two of the categories have higher cor- 
relations with Total Attitude scores. Of the 
two remaining categories, the correlation 
between Friends and Total Attitudes remains 
the same, while that between Religion and 
Total Attitudes goes down. However, the 
rank orders of the coefficients are almost 
identical for the two forms of the Inventory. 
The lowest correlation between a category 
and Total Attitude score is that for Religious 
Attitudes. This coefficient is so low as to 
raise a question whether the Religion cate- 
gory should be retained in the Inventory. 
The very high coefficient of correlation 
between the Happiness scores and the Total 
Attitude scores (.84) suggests that the Hap- 
piness category might be used by itself as a 
short form of the Inventory. However, such 
a procedure should be used only with the 
greatest caution, for this category is probably 


the most open to the operation of self-deceit 
in a subject. It is more subjective and 
“attitudinal” than are most of the other categories, which have statements that are quasi-objective. For example, the Happiness category contains the statement, “These are the best years of my life.” This is probably much less objective in its meaning to the respondent than such statements from other categories as, “I am happy only when I have definite work to do,” and “I have enough money to get along.”

Among the inter-category correlations, Happiness has the highest correlation with other categories, and Feelings of Usefulness is next, while Religion and Family scores have the lowest correlations with other categories. This was also true for the earlier form of the Inventory.



SUMMARY AND CONCLUSIONS 

The Chicago Attitude Inventory appears 
to have a satisfactory degree of validity as a 
measure of personal adjustment in old age. 
This is indicated by the following facts. 

1. Scores on the Attitude Inventory corre- 
late .78 with scores on the Activities Inven- 
tory, the latter being the subject’s self-report 
on health, economic status, and social and 
religious activities. 

2. Scores on the Attitude Inventory corre- 
late .73 and .74 with ratings of the subjects 
made by trained research workers. The rat- 
ings were made by analyzing interviews with 
the subjects, together with subjects’ responses 
to the Activities Inventory. 

A third criterion of validity gives incon- 
clusive results. This is the criterion of the 
judgments of persons who know the subject 
and have observed his way of life. When field workers who have interviewed the subjects rate them on a Check-List, their ratings show a correlation coefficient of .64 with scores on the Attitude Inventory. However, when community residents use the Check-List, their ratings show a correlation coefficient of only .12 with scores on the Attitude Inventory. There is reason to believe, however, that the community judges were not very well acquainted with the subjects, and they were not trained in the use of rating procedures.

This third criterion is, in the opinion of the
writer, the most important of the three cri- 
teria if the conditions for valid judgment by 
community judges can be met. These con- 
ditions were not met in the present study. 

On the whole, the Chicago Attitude Inven- 
tory is found to have a high enough degree 
of validity to justify its use with groups of 
older people who are not senile. The validity 
is not high enough, however, to permit drawing
conclusions about individual cases without
support from data of other kinds.


HISTORICAL REMARKS


PHILOSOPHERS with epistemological interests
have often discussed the question of
how we came to know the content of
other people’s minds. As the subject was
treated in European philosophy, however, the
main interest was in the logical justification
for the belief in other minds, rather than in
a psychological analysis of how the knowledge
of other minds is actually obtained.
From Locke onwards, knowledge of other
minds was treated as the result of an elaborate
sequence of inferences. This process of
inference, or logical construction, is a complicated
fiction, which does not, and is not
intended to, correspond with the ontogenetic
development of knowledge. The philosopher
C. D. Broad says:

It seems to me to be absolutely certain that the
belief in other minds and the belief that a certain
mind is having a certain experience on a certain
occasion are not reached by inference, even if they
can be afterwards justified up to the hilt by inference.
It is perfectly certain that I do not now
make an inference when I see my friend frowning
and believe that he is angry. And the notion that,
as a baby, I began by looking in a mirror when I
felt cross, noting my facial expression at the time,
observing a similar expression from time to time
on the face of my mother or nurse, and then arguing
by analogy that these external bodies are probably
animated by minds like my own, is too silly
to need refutation. If the belief in other minds
and other mental events were reached in this way,
it might perhaps be entertained as a bold speculative
opinion by a few exceptionally ingenious and
observant persons at the ripe age of thirty-five (4).

A recent philosophical analysis of how we 
understand the thoughts of others occurs in 
Findlay (12). 

Other philosophers have recognized the 
difference and have attempted to describe the 
process of obtaining psychological knowledge,
as in Hutcheson’s discussion of “public
sense” (14, Sect. I, 1), or in Kant’s concept
of “teleological judgment.” We have neither
the space nor the knowledge to speak of the
long line of German philosophical discussions
that derive from Kant’s work. The writings
of Dilthey, Stern and Spranger belong to this
tradition. Some account of them is given in 
Allport (2, Chap. xix). Though this school 
produced much interesting discussion, it did 
not often result in experimental work. 


EXPERIMENTAL APPROACHES 


In the English-speaking world, Darwin's 
study (7) of the expression of the emotions 
initiated a good deal of empirical study. 
Darwin recognized the social role of emo- 
tional expression and held that there is 
appropriate instinctive recognition of the 
meaning of emotional expressions. Darwin's 
theories inspired the experimental studies of 
emotional behaviour in children made by 
Watson (27) and studies of the recognition 
of emotions by Sherman (20). There were 
also studies of the recognition of emotions 
from posed photographs (Feleky [11], Ruck- 
mick [16]). The whole subject is reviewed 
in Woodworth (28, Chap. xi). The result 
of these studies was on the whole disappoint- 
ing. They failed to achieve any decisive 
analysis of how good our knowledge is or 
how it is obtained. In common with other 
studies of the time, these experiments dealt 
with mind-in-general rather than with indi- 
viduals, and with momentary states rather 
than with more stable qualities of person- 
ality. This line of experiment seems to have 
petered out about 1928, after various sceptical 
studies had emphasized the imperfections of 
our judgment of emotions. 

A different line of approach came from the 
use of rating scales, which became popular 
after World War I. Several attempts were 
made to validate personality ratings. But 
here a logical difficulty presented itself. 
Judgments of personality traits described in 
the adjectives of ordinary speech are pain- 
fully vague. Their vagueness is not merely 
a defect in the observer but a result of the 
social use of language. There can be no 
exact agreement on the persons and acts to 
which adjectives like “brave” or “honest” 
should be applied. They are general descriptive
terms for deviations from a norm which
is only vaguely defined by usage and social
expectation, and it is no use expecting to get 
anything much more precise from such terms. 
Many experimental attempts at validating 
such ratings have been published. For 
example, Rugg (17) compared ratings of 
army officers with tests of behaviour, and 
also approached validity indirectly through 
the more accessible quality of reliability. 
Valentine (24) used the ratings of teachers 
who were well acquainted with the boys and 
girls of his study, as a criterion of validity. 
Adams (1) and Sears (19) used the pooled 
judgments of a number of observers. This 
appeal to democracy seems out of place be- 
cause judgments may be affected by various 
constant errors. Taylor (22) attempted to
validate judgments of personality from voice
by comparison with self-ratings on a neurotic
inventory, a judgment which is notoriously
fallible. Vernon (25) and Estes (9) attempted
validation of ratings based on some slight 
acquaintance, by ratings based on a careful 
study using a variety of personality tests. 
Though these are superior in technique to 
the earlier experiments, they still rest upon 
definitions that are inevitably vague and sub- 
jective. It is open to question whether the 
trait names used by Estes are really all
applicable to all the subjects rated. If a per- 
son’s behaviour is inconsistent on a trait of 
autonomy, so that he shows much autonomy 
in some situations and little in others, then 
no rating of the subject on this trait, however 
carefully considered, can have high validity, 
because the term is not really relevant to the 
subject, and he should not be held to possess 
any degree of the trait. Any attempt at 
direct validation of personality ratings must 
fail because of the vagueness and fallibility 
of the trait concepts available. Dissatisfied 
with the procedures described above, experi- 
menters have turned to two other methods of 
indirectly validating subjective judgments. 
These are matching and prediction. 


THE MATCHING METHOD

The matching method has been consider- 
ably explored, and excellent surveys of its 
use are to be found in Vernon (26) and 
Allport (2). Typical studies using this 
method are Eysenck (10) and Swift (21). 
The method measures the ability of judges 
to match two kinds of expressive production
or one expressive production with a rating
or description by an acquaintance. For 
example, a judge is presented with six 
samples of handwriting, from which he 
attempts to assess the personalities of the 
writers. If he merely records what qualities 
he believes the writers to possess, his opin- 
ions, however well worked out, remain in- 
fected with subjectivity. But if he is given 
six brief character sketches of the writers and 
asked to match these with the specimens of 
handwriting, his judgment takes on a com- 
pletely objective character and can be un- 
equivocally validated or invalidated. His 
successes can be compared with those that 
would be obtained by chance, the null 
hypothesis being that no element of skill 
enters into the matching. The probability 
of any given excess over chance can be cal- 
culated by the usual methods. The matching 
technique is most effective for testing the 
validity of wild or speculative notions about 
character assessment. It shows whether there 
is anything in a method or not, but it is not 
a convenient device for analysing the basis of 
judgment. What hampers analysis is that in 
a matching experiment the various parts of 
the material to be matched are not independ- 
ent of one another. A pair of items that is 
easy to match in one series might be difficult 
to match in another series. As Vernon (26) 
has pointed out, this is similar to the prop- 
erty of the correlation coefficient, that its 
magnitude depends on the dispersion of the 
group tested. When the subjects whose 
products are to be matched vary markedly 
from one another, matching is easier than 
when differences are slight. Thurstone (23) 
shows that this influences a great variety of 
situations. 
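
The chance baseline for a matching experiment can be worked out exactly by enumeration. The sketch below is only an illustration, not the procedure of any study cited here: it assumes six specimens matched one-to-one with six sketches, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def match_distribution(n):
    """Exact chance distribution of the number of correct matches
    when n sketches are paired at random, one-to-one, with n specimens."""
    counts = [0] * (n + 1)
    total = 0
    for perm in permutations(range(n)):
        # A "hit" is a position where the random pairing is the right one.
        hits = sum(1 for i, p in enumerate(perm) if i == p)
        counts[hits] += 1
        total += 1
    return [Fraction(c, total) for c in counts]


def tail_probability(n, k):
    """Probability of k or more correct matches by chance alone."""
    return sum(match_distribution(n)[k:])


dist = match_distribution(6)
mean_hits = sum(k * p for k, p in enumerate(dist))
print("chance expectation of correct matches:", mean_hits)
print("P(4 or more correct out of 6):", tail_probability(6, 4))
```

With six items the chance expectation is one correct match, and four or more correct matches would arise by chance in 1 case out of 45. The same enumeration makes the dependence on series length visible, since the whole distribution changes with n.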

Success in matching is also affected by the
length of the series. When series are long,
matching is more difficult, since the attempt
to keep many possibilities in mind imposes
a strain on the judge. On some occasions it
is also possible to match by elimination, when
all items but one have been matched on positive
evidence. These are three ways in which
the total series influences success on separate 
matchings. 


PREDICTIVE TECHNIQUES 


A method of validation that promises more
opportunities for analysis is one based on the
prediction of specific acts. Positivist philosophers
have often emphasized the importance
of prediction for the validation of hypotheses.
Schlick, for example, said:

For the physicist as investigator of reality, the
only thing of importance, the only determining test,
that which is the sole essential, is that the equations
derived from certain data also hold good for new
data. Only if this is the case does he regard his
formula as a natural law. In other words, the true
criterion of regularity, the essential characteristic of
causality, is the realisation of predictions (18).


Psychologists have, on occasion, tried to
use prediction for the verification of hypotheses.
Healy (13), for example, tried to predict
which of his delinquents would benefit from
treatment. Bender (2) tested his own understanding
of persons he had interviewed, by
attempting to predict their scores on various
personality tests. Some tentative trials of
predictive methods were made by the judges
in Murray’s (15) experiments. Winslow’s
experiment (28), in which judges were asked
to predict how acquaintances would fill in a
questionnaire on political attitudes, is much
closer to our method than anything else
encountered in a survey of the literature.

STATEMENTS (each followed by SELF-RATINGS columns -3, -2, -1, +1, +2, +3)

I am usually the one to make the necessary
decisions when I am with another person.

I am considered compliant and obliging by my
friends.

I am disinclined to adopt a course of action dictated
by others.

And so on for 15 more items.

The conditions requiring to be fulfilled 
by a good prediction test are these: 

1. The act to be predicted must be expressive; it
must be an act that one would have little chance
of predicting successfully on the basis of knowing
merely the kind of person one is judging; but an
act that can be fairly well predicted if one knows
the person individually.

2. The ability to predict must depend on insight
into the meaning and pattern of the subject’s behaviour,
not merely on detailed knowledge of his
habits or memory for small facts.

3. The prediction must be quite specific and
objective, either based on the choice of a limited
and known set of alternatives or expressed in
quantitative form.

4. The probability of chance success should be
calculable.

5. A considerable proportion of those who know 
the subject well should be able to make predictions 
that are superior to chance; and very few should 
be able to make perfect predictions. 

6. For the purposes of our experiment we added 
the further condition that no special technical 
knowledge should be required for making the 
predictions. 


These conditions were satisfactorily met by 
a questionnaire derived from those contained 
in Murray (14, Chap. 3). Eighteen items 
were selected, one from each of 18 traits: 
abasement, achievement, affiliation, etc. Subjects
were required to rate themselves on a
six-point scale, the instructions for which
follow:

SELF-RATING TEST

In this test you are asked to compare your behaviour
and emotions with those of most men and
women of your own age. Read each statement 
carefully, and make up your mind whether it is 
more or less true for you than it is for the average 
person. Then make a tick in the proper column 
according to the following system: 

Column —3 (minus three)=I do, or I feel, or I 
think this thing very much less often (or in- 
tensely) than the average. 

Column —2 (minus two)=I do, or I feel, or I 
think this thing less often (or intensely) than the 
average. 

Column —1 (minus one) = average, but on the low 
side. 

Column +1 (plus one)= average, but on the high 
side. 

Column +2 (plus two)=I do, or I feel, or I think
this thing more often (or intensely) than the
average.

Column +3 (plus three)=I do, or I feel, or I
think this thing very much more often (or intensely)
than the average.

The filling in of this questionnaire was
then taken as the act to be predicted, the act 
whose predictability is used as a measure of 
the judge’s insight. The object of asking the 
subjects to rate themselves was to provide the 
judges with an act to be predicted—an act 
having the necessary properties of expressive- 
ness, limited alternatives, etc., that would 
satisfy the criteria stated above for a good 
measure of insight. The instruction given to 
judges was as follows: “(Name) has also 
filled in this form about himself. You are 
now asked to guess what he has said about 
himself, and to fill it in below.” (Rating 
instructions were then repeated.) “Now fill 
in the answers as you think (name) has filled 
them in.” The rating scale was then repeated 
below. 

This procedure is superficially similar to 
asking the judge to rate the subject on these 
traits; but the results are different in signifi- 
cant respects. To use one rating to validate 
another is to set oneself an essentially in- 
soluble problem. We may have reason to 
suppose that one rating is somewhat better 
than another because it is based on more 
knowledge, has been considered more care- 
fully, etc. But the indeterminacy of the prob- 
lem follows from the fact that we can never 
establish osie rating as the correct one because 
the trait to be rated has no single and un- 
equivocal meaning. On the other hand, 
the act of self-rating is a perfectly definite 
act, involving an unambiguous choice from 
a limited number of alternatives, and an 
attempt to predict this act has a precisely 
measurable error. Such predictions can thus 
serve as an instrument for analysing the 
nature of psychological insight, for measur- 
ing its variation under varying conditions. 
A prediction test is a better analytical instru- 
ment than a matching test because with the 
former each item can be evaluated separately, 
and its difficulty does not depend (as with a 
matching test) on the total series to which 
it belongs. 

The value of predictions in proving the 
validity of insight depends on the degree to 
which they are better than chance, i.e., better 
than those that could be made without any 
knowledge of the person whose self-rating 
is being predicted. It is thus important to 
have a method of calculating the degree of 


success that would, on the average, be ob- 
tained by chance. This is obtained as follows: 

The original ratings are scored on a six-point
scale, from -3 to +3, omitting the
central value 0. Following Murray’s practice,
we re-score as follows:

Original rating:   -3   -2   -1   +1   +2   +3
Rescoring:          0    1    2    3    4    5


This is more satisfactory for calculation, 
since it provides a scale of equal intervals. 
All subsequent references to self-ratings or 
predictions in this paper are to re-scored 
values. 

In the absence of special information, we 
assume all self-ratings to be equally probable. 
The chances of error are not, however, equal 
for all self-ratings. This is made plain in
Table 1. The various possible self-ratings
are shown on the vertical scale; the various
predictions on the horizontal scale; while in
the cells are shown the absolute values of the
errors of prediction.

TABLE 1

VALUES OF ERRORS OF PREDICTION IN RELATION TO SELF-RATINGS

                    PREDICTION            MEAN ABSOLUTE
SELF-RATING    0   1   2   3   4   5      ERROR
     0         0   1   2   3   4   5      2.50
     1         1   0   1   2   3   4      1.83
     2         2   1   0   1   2   3      1.50
     3         3   2   1   0   1   2      1.50
     4         4   3   2   1   0   1      1.83
     5         5   4   3   2   1   0      2.50
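
The a priori table can be regenerated from the rules of the test alone. The sketch below assumes that the six possible predictions are treated as equally likely for each self-rating, which is how the tabulated means appear to be formed; the names `SCALE` and `chance_error` are ours.

```python
from fractions import Fraction

SCALE = range(6)  # re-scored ratings 0..5


def chance_error(self_rating):
    """Mean absolute error of a prediction drawn uniformly from 0..5."""
    return Fraction(sum(abs(self_rating - p) for p in SCALE), len(SCALE))


table = {s: chance_error(s) for s in SCALE}

for s in SCALE:
    # One row per self-rating: the six absolute errors, then their mean.
    row = " ".join(str(abs(s - p)) for p in SCALE)
    print(f"self-rating {s}: {row}   mean = {float(table[s]):.2f}")
```

The means rise from 1.50 at the central ratings to 2.50 at the extremes, which is why the chance error has to be looked up separately for each self-rating rather than taken as a single constant.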

This table is, of course, purely a priori, 
based only on the rules of the test, and con- 
tains no empirical facts. We are now in a 
position to calculate the theoretical error of 
a series of predictions, to compare it with the 
actual error, and to calculate the significance 
of the difference. The method can best be 
explained by working an example. Table 2 
shows the calculation of the theoretical error 
for one subject. Subject 54a did the self-rating,
and his ratings are shown in Column
A; his wife (Subject 54d) predicted what
his self-ratings would be, and her predictions
are shown in Column B. Column C is the
absolute value of the difference between A
and B and thus represents the wife’s error
of prediction. Column T represents the theoretical
chance error (derived from Table 1).
The excess of chance error over actual error
is taken as the measure of the wife’s insight.
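
The scoring rule can be put as a short routine. This is a sketch under stated assumptions: the six ratings below are invented for illustration and are not those of Subject 54a, and the helper names are hypothetical. The chance error per item is taken as the uniform-prediction mean used for Table 1.

```python
# Re-scoring of the original six-point scale onto 0..5.
RESCORE = {-3: 0, -2: 1, -1: 2, 1: 3, 2: 4, 3: 5}


def chance_error(self_rating):
    """Column T: mean absolute error of a random prediction on 0..5."""
    return sum(abs(self_rating - p) for p in range(6)) / 6


def insight_score(self_ratings, predictions):
    """Return (insight, theoretical error, actual error) for one judge."""
    if len(self_ratings) != len(predictions):
        raise ValueError("one prediction is needed for every self-rating")
    actual = sum(abs(a - b) for a, b in zip(self_ratings, predictions))  # Column C
    theoretical = sum(chance_error(a) for a in self_ratings)             # Column T
    return theoretical - actual, theoretical, actual


# Hypothetical raw ratings for six items (the real test has eighteen).
raw_self = [-3, 2, 1, -1, 3, -2]
raw_pred = [-2, 2, 2, -1, 1, -2]
column_a = [RESCORE[r] for r in raw_self]
column_b = [RESCORE[r] for r in raw_pred]

insight, theoretical, actual = insight_score(column_a, column_b)
print(f"theoretical {theoretical:.2f} - actual {actual} = insight {insight:.2f}")
```

A positive score means the judge did better than random guessing would on the average; a negative score means worse.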


B. Notcutr anpD 
THE EXPERIMENT
As an example of the possibilities of the
predictive method, we applied it to the
ancient problem of feminine intuition. We
proposed to compare husbands’ understanding
of their wives, with wives’ understanding
of their husbands. It was considered that
husbands and wives have equal opportunities
to know and study one another so that the
ability to predict one another’s acts would
be a valid measure of psychological insight.
The self-rating scale described above was
filled in by 64 married couples, all of whom
subsequently attempted to predict their consort’s
self-ratings. The subjects of the experiment
lived in the neighborhood of Durban
and were all of European descent. Most of
the testing was done by the junior author.
Precautions were taken to prevent collusion.

TABLE 2

EXAMPLE SHOWING EVALUATION OF ONE JUDGE’S ERRORS OF PREDICTION

[The item-by-item columns (self-ratings of Subject 54a, predictions
of Subject 54d, error, and chance error T) are not recoverable here.
The chance errors sum to Σ(T) = 37.]

“Insight” = theoretical error minus actual error = 37 - 15 = 22.
The significance of the excess over chance can be calculated by
Student’s t-test.

FIG. 1. INSIGHT SCORES OF 64 HUSBANDS AND THEIR
WIVES. M = 10.2. N = 128. σ = 8.25. SE_M = 0.73.
(Horizontal scale from -20 to 30; vertical scale, subjects.)
Figure 1 shows the insight scores of 64 
husbands and their wives. It will be noticed 
that 10 subjects (7 women and 3 men) made 
predictions that were inferior to what could 
have been obtained on the average by chance. 
The mean insight score, i.e., the superiority 
of the group over the average chance score, 
is 10.2 with a standard error of 0.73. It is
thus clear that a large proportion of the
group are able to make genuinely successful 
predictions. 
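
The reported standard error can be checked against the other summary figures. The sketch below assumes the values given with Figure 1 (mean 10.2, standard deviation 8.25, 128 judges) and, as our own illustration rather than the authors' stated computation, forms a one-sample t ratio against a chance mean of zero.

```python
import math

n_judges = 128       # 64 husbands and 64 wives
mean_insight = 10.2  # mean insight score
sd_insight = 8.25    # standard deviation of insight scores

# Standard error of the mean, then the ratio of the mean to it.
se_mean = sd_insight / math.sqrt(n_judges)
t_stat = mean_insight / se_mean

print(f"standard error of the mean = {se_mean:.2f}")
print(f"t for mean insight against zero = {t_stat:.1f}")
```

The standard error comes out at 0.73, agreeing with the text, and the mean lies about fourteen standard errors above the chance value.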

The same data are shown in another way 
in Figure 2. Here, instead of showing the 
distribution of the insight scores of individual 
judges, we show the distribution of all errors 
of prediction, compared with the expected 
chance distribution. 


FIG. 2. PREDICTION TEST. Errors of prediction of
128 subjects, compared with theoretical distribution
on the null hypothesis. Continuous line
is theoretical error. Broken line is actual error.
N = 2304 judgments.


Table 3 gives the distribution of the differ- 
ences between the insight scores of husbands 
and wives, so arranged that differences ap- 
pear as positive when wives’ scores are greater 
than husbands’. Of the 64 couples for which 
insight scores were calculated, there were two 
in which both wife and husband showed in- 
sight inferior to chance. These two were 
omitted from Table 3. There were also six 
couples of which one member obtained a 
score inferior to chance. For the purpose of 
comparing husbands and wives, these six per- 
sons were considered to have insight scores 
of zero, and no negative scores were included. 
This seemed to be the most sensible proceed- 
ing, since it is probable that negative scores 
are due to lack of insight or inability to 
understand the procedures. It will be seen 
that in 37 couples the husband showed 
superior insight, and in 25 couples the wife 
showed superior insight. The mean differ- 
ence is not statistically reliable. So far as our 
evidence goes, intuition is not specifically 
feminine. 
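
The text reports a test on the mean difference without giving its details. As a rough substitute, and only as a substitute, the counts of 37 and 25 couples can be put to an exact two-sided sign test, which asks whether such a split is unusual if either spouse is equally likely to score higher. The function name is hypothetical.

```python
from math import comb


def two_sided_sign_test(k, n):
    """Exact two-sided binomial probability for a k : (n - k) split, p = 1/2."""
    extreme = max(k, n - k)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p_value = two_sided_sign_test(37, 62)
print(f"two-sided sign test, 37 vs 25 couples: p = {p_value:.3f}")
```

The sign test ignores the size of each difference, so it is cruder than a test on the mean; here it points the same way as the conclusion in the text, the split falling short of the conventional 5 per cent level.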
TABLE 3

INTUITION SCORES OF 62 WIVES
(Wife’s insight score minus husband’s insight score)

Score classes: -19.9 to -15; -14.9 to -10; -9.9 to -5;
-4.9 to 0; 0 to 4.9; 5 to 9.9; 10 to 14.9.
[The frequency column is not recoverable here.]

The data we have collected can also be used
to supply evidence on various other questions.
One of these concerns the relation between
the accuracy of judgment and the similarity
of the judge to the subject whom he is judging.
It has often been said, e.g., Allport (2,
p. 513), that people are better judges of those
like themselves. Murray (15, p. 275-6) comes
to the same conclusion. The theory that one
should set a thief to catch a thief is an
example of this view. We cannot test the
validity of the notion because we have no 
knowledge of the “real” qualities of our sub- 
jects, but we can compare the accuracy of the 
predictions made, with the difference be- 
tween the self-ratings of husband and wife. 
We analysed the data, not in terms of the 
128 persons, but of the 2,304 judgments that 
they made. This problem is similar to that 
studied by Winslow (27), with similar results. 

The results of our analysis are shown in 
Figure 3. Here the judgments of husbands 
and wives are combined into a single table, 
since both show the same tendencies. The 
results are quite definite. The greater the 
difference between self-ratings on any one 
item, the greater is the error of prediction. 
This tends to confirm the view that we judge 
others by analogy with ourselves, and the less 
valid the analogy, the less accurate is the 
judgment. 
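
The tabulation behind this comparison is a grouping of judgments by the distance between the two self-ratings on the same item. A minimal sketch follows; the six judgments are invented for illustration, and the function name and tuple layout are our own assumptions.

```python
from collections import defaultdict


def mean_error_by_difference(judgments):
    """Group judgments by |judge's self-rating - subject's self-rating|
    and return the mean absolute error of prediction in each group.

    Each judgment is (judge_self, subject_self, prediction), all on 0..5.
    """
    sums = defaultdict(lambda: [0, 0])  # difference -> [total error, count]
    for judge_self, subject_self, prediction in judgments:
        diff = abs(judge_self - subject_self)
        sums[diff][0] += abs(subject_self - prediction)
        sums[diff][1] += 1
    return {d: total / n for d, (total, n) in sorted(sums.items())}


# Hypothetical judgments, not data from the experiment.
sample = [(2, 2, 2), (2, 2, 3), (1, 4, 2), (5, 0, 3), (3, 4, 4), (0, 3, 1)]
result = mean_error_by_difference(sample)
for diff, err in result.items():
    print(f"difference {diff}: mean error {err:.2f}")
```

Applied to the full set of 2,304 judgments, a rising mean error across the difference classes would correspond to the trend described above.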

We attempted also to test another hypothe- 
sis of Allport’s, that central or dominating 
traits are more accurately rated than other 
traits (2, p. 301). In order to see if our data 
fitted this hypothesis, we calculated the mean 
error of prediction in relation to the self-ratings;
our supposition was that the extreme
self-ratings, representing traits that were 
believed to exist in a marked degree, might 
be more accurately predicted. Our data do 
not confirm the theory. The trend rather 
follows that of the theoretical chance error. 


FIG. 3. MEAN ERROR OF JUDGMENT, SHOWN IN RELATION
TO DIFFERENCE BETWEEN HUSBANDS AND
WIVES IN SELF-RATINGS. Subjects number 128;
self-ratings and judgments, 2304.
(Horizontal axis: difference between self-ratings of husbands
and wives, 0 to 5; vertical axis: mean error of judgment.)


Figure 4 shows the actual and theoretical
errors of prediction for the more extreme and
the more central self-ratings.

FIG. 4. ACTUAL AND THEORETICAL ERRORS OF PREDICTION
IN RELATION TO VARIOUS SELF-RATINGS.
Continuous line is theoretical error; dotted line is
actual error.

Subjects vary greatly in their tendency to
favour central or extreme ratings. There
were some who used no 0 or 5 ratings and a
few who used nothing else. A curious and
quite unexplained feature of the self-ratings
was the tendency for husbands and wives to
make a similar use of extreme self-ratings.
To test this, we correlated the frequency of
extreme ratings in husbands and wives and
found r = 0.45 ± 0.096.


APPLICATIONS 


Although a number of experiments have 
previously been performed in which predic- 
tions have been used to measure insight, 
there does not appear to have been full appre- 
ciation of the possibilities of the method. 
Many other acts besides the filling in of a 
self-rating scale can be used for predictions. 
For example, we have considered developing 
for the same purpose a sentiments test—a 
list of opinions or attitudes with which the 
subject is asked to agree or disagree. A level 
of aspiration test is another whose result 
might be predicted by judges, though here 
some expert knowledge would be required 
of the judges. 

Such a measure of insight can be put to a 
great variety of uses. It can be used, for 
instance, to measure the relative amounts of 
insight possessed by various groups, and into 
various groups. It can be used also to meas- 
ure the effectiveness of various means of 
obtaining insight. For instance, a group of 
experts on TAT analysis could base predic- 
tions of answers to a self-rating scale on their 
analysis of TAT protocols, while another 
group of Rorschach experts might do the 
same with their chosen instrument. The 
relative effectiveness of TAT and Rorschach 
could then be objectively assessed by the 
validity of predictions based on them. 

The psychological skill of individuals could 
be similarly assessed. A certain proficiency 
might be required of those majoring in clini- 
cal psychology. And one can even imagine 
national competitions, batting averages pub- 
lished quarterly in the American Psycholo- 
gist, etc. For the development of such 
measures, it would be necessary that a few 
predictive tests should be adopted for general 
use, so that the results of different scales can 
easily be compared. The repeated use of a 
few standard predictive tests may prove to be 
an important step in developing objective
psychology. It may also initiate a type of
quantitative measurement which is not de- 
pendent on the establishment of norms, with 
all the expense and fallacy that are com- 
monly involved in the latter process. In a 
predictive test the measurement obtained is 
a direct property of the predictor and is not 
relative to the norm of a group, as most 
current psychological measurements are. The 
predictive test can thus claim to be Galilean 
rather than Aristotelian. 


SUMMARY 


The principles of the predictive method 
for validating insights into personality are 
discussed. An experiment is described in 
which 64 married couples were given a self- 
rating scale and were then asked to predict 
one another’s self-rating. Predictions were 
significantly superior to chance. Successes 
were greater on items where subjects rated 
themselves similarly. There was no signifi- 
cant difference between the insight of hus- 
bands into wives, and of wives into husbands. 
SOME EFFECTS OF CERTAIN COMMUNICATION PATTERNS
ON GROUP PERFORMANCE
BY HAROLD J. LEAVITT


Nejelski and Company 


INTRODUCTION 


COOPERATIVE action by a group of individuals
having a common objective
requires, as a necessary condition, a
certain minimum of communication. This 
does not mean that all the individuals must 
be able to communicate with one another. 
It is enough, in some cases, if they are each 
touched by some part of a network of com- 
munication which also touches each of the 
others at some point. The ways in which the 
members of a group may be linked together 
by such a network of communication are 
numerous; very possibly only a few of the 
many ways have any usefulness in terms of 
effective performance. Which of all feasible 
patterns are “good” patterns from this point 
of view? Will different patterns give dif- 
ferent results in the performance of group 
tasks? 

In a free group, the kind of network that 
evolves may be determined by a multitude of 
variables. The job to be done by the group 
may be a determinant, or the particular abili- 
ties or social ranks of the group members, or 
other cultural factors may be involved. 

Even in a group in which some parent 
organization defines the network of com- 
munication, as in most military or industrial 
situations, the networks themselves may differ
along a variety of dimensions. There
may be differences in number of connections,
in the symmetry of the pattern of connections,
in “channel capacity” (how much and ...
in many ...

... are imposed on group members by various
communication patterns, and the effects of 
these conditions on the organization and the 
behavior of its members. We tried to do 
this for small groups of a constant size, 
using two-way written communication and 
a task that required the simple collection of 
information. 


Some Characteristics of Communication 
Structures 

The stimulus for this research lies primarily
in the work of Bavelas (1), who considered
the problem of defining some of the
dimensions of group structures. In his study,
the structures analyzed consist of cells connected
to one another. If we make persons
analogous to “cells” and communication
channels analogous to “connections,” we find
that some of the dimensions that Bavelas 
defines are directly applicable to the descrip- 
tion of communication patterns. Thus, one 
way in which communication patterns vary 
can be described by the sum of the neighbors 
that each individual member has, neighbors 
being defined as individuals to whom a mem- 
ber has communicative access. So, too, the 
concept of centrality, as defined by Bavelas, is 
of value in describing differences within and 
between structures. The most central position 
in a pattern is the position closest to all other 
positions. Distance is measured by number 
of communicative links which must be 
utilized to get, by the shortest route, from 
one position to another. 

Bavelas also introduced a sum of neighbors 
measure—sum of neighbors being a sum- 
mation, for the entire pattern, of the number 
of positions one link away from each position.
Similarly, sum of distances is the summation,
for all positions, of the shortest
distances (in links) from every position to 
every other one. 
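
These measures are easy to compute once a pattern is written down as a list of links. The sketch below uses two hypothetical five-position patterns, a chain and a star, since the figure itself is not reproduced here. "Most central" is operationalized, as a simplifying assumption of ours, as the position with the smallest total distance to all others.

```python
from collections import deque


def distances_from(start, links):
    """Shortest distances (in links) from start, by breadth-first search."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in links[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def describe(links):
    """Return (sum of neighbors, sum of distances, most central position)."""
    sum_neighbors = sum(len(v) for v in links.values())
    per_position = {p: sum(distances_from(p, links).values()) for p in links}
    sum_distances = sum(per_position.values())
    most_central = min(per_position, key=per_position.get)
    return sum_neighbors, sum_distances, most_central


# Hypothetical patterns; each entry lists the neighbors of a position.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
star = {"a": ["d"], "b": ["d"], "c": ["d"], "e": ["d"], "d": ["a", "b", "c", "e"]}

print("chain:", describe(chain))
print("star: ", describe(star))
```

The chain and the star have the same sum of neighbors (8) but different sums of distances (40 and 32), with c and d respectively the most central positions. Two patterns can thus agree on one dimension and differ on another.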

Unfortunately, these dimensions we have 
mentioned do not in themselves uniquely 
define a pattern of communication. What 
defines a pattern is the way the cells are connected,
regardless of how they are represented
on paper. In essence, our criterion is
this: if two patterns cannot be “bent” into 
the same shape without breaking a link, they 
are different patterns. A more precise defi- 
nition of unique patterns would require the 
use of complex topological concepts. 


FIG. 1. COMMUNICATION PATTERNS (see text)


Some Operational Characteristics of Com- 
munication Patterns 


Consider the pattern depicted as A in 
Figure 1. If at each dot or cell (lettered a, 
b, etc.) we place a person; if each link (line 
between dots) represents a two-way channel 
for written communications; and if we 
assign to the five participants a task requir- 
ing that every member get an answer to a 
problem which can be solved only by pooling 
segments of information originally held 
separately by each member, then it is possible 
a priori to consider the ways in which the 
problem can be solved. 

Pattern Flexibility. First we note that the 
subjects (Ss) need not always use all the 
channels potentially available to them in 
order to reach an adequate solution of the 
problem. Although pattern A (Fig. 1) con- 
tains potentially seven links or channels of 
communication, it can be solved as follows 
with three of the seven channels ignored. 


Step 1: a and e each send their separate items of
information to b and d respectively.

Step 2: b and d each send their separate items of
information, along with those from a and e
respectively, to c.

Step 3: c organizes all the items of information,
arrives at an answer, and sends the answer to b
and then to d.

Step 4: b and d then send the answer to a and e
respectively.


The use of these particular four channels 
yields pattern C (Fig. 1). The original 
seven-link pattern (A) can be used as a 
four-link pattern in various ways. For 
instance, each of the four Ss diagrammatically
labelled c, b, a, and e might send his
item of information to d who would organize 
the items, arrive at the answer, and send it 
back to each respectively. Use of these par- 
ticular four channels would yield the pattern 
B in Figure 1. The problem could also be 
solved by the Ss using five, six, or all of the 
seven potential channels. 

Operational Flexibility. Secondly, with the 
specification that a given number of links be 
used, any pattern can be operated in a variety 
of ways. Thus the pattern D (Fig. 1), which 
has no pattern flexibility, can be used as 
shown in D-1, with information funnelled 
in to C and the answer sent out from C. It 
is also possible to use it, as in D-2, with E 
as the key position; or as in D-3. These are 
operational differences that can be character- 
ized in terms of the roles taken by the various 
positions. Thus in D-1, C is the decision- 
making position. In D-2, it is E or A. 
Some patterns can be operated with two or 
three decision-makers. 


The Definition of Maximum Theoretical 
Efficiency 

Before going further it may be helpful to 
state the task used in this research. To each 
S, labeled by color (see Fig. 2), was given a 
card on which there appeared a set of five 
(out of six possible) symbols. Each S’s card 
was different from all the others in that the 
symbol lacking, the sixth one, was a different 
symbol in each case. 

Thus, in any set of five cards there was 
only one symbol in common. The problem 
was for every member to find the common 
symbol. To accomplish this each member 
was allowed to communicate, by means of 
written messages, with those other members 
of the group to whom he had an open chan- 
nel (a link in our diagrams). Every separate 
written communication from one S (A) to 
another (B) was considered one message. 
An S who had discovered the answer was
allowed to pass the answer along. 
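The structure of the task can be illustrated with a short sketch. The symbol names and color labels below are placeholders chosen for the example, not those used in the experiment; only the arrangement (six symbols, five cards, each card lacking a different symbol) follows the description above.

```python
# Hypothetical symbol names; the text says only that six symbols existed.
SYMBOLS = {"circle", "triangle", "star", "square", "plus", "diamond"}


def deal_cards(missing_by_subject):
    """Give each S a card holding every symbol except the one named."""
    return {s: SYMBOLS - {m} for s, m in missing_by_subject.items()}


def common_symbol(cards):
    """Pool all cards and return the single symbol present on every one."""
    shared = set.intersection(*cards.values())
    if len(shared) != 1:
        raise ValueError("cards do not share exactly one symbol")
    return next(iter(shared))


def symbol_nobody_lacks(missing_by_subject):
    """Same answer reached from the missing symbols alone."""
    remaining = SYMBOLS - set(missing_by_subject.values())
    if len(remaining) != 1:
        raise ValueError("missing symbols do not leave exactly one symbol")
    return next(iter(remaining))


# Hypothetical trial: five Ss (labelled by color), five different gaps.
missing = {
    "white": "circle",
    "red": "triangle",
    "brown": "star",
    "yellow": "square",
    "blue": "plus",
}
cards = deal_cards(missing)
print(common_symbol(cards))
```

Because each card lacks a different symbol, the common symbol is also the one symbol that no card lacks, so the two functions must agree on any valid trial.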

Minimum Number of Communications.
For any pattern of n Ss, the minimum number
of communications, C, is given by
$C = 2(n-1)$.

Fig. 2. Symbol Distribution by Trial

Theoretically, then, with number of mes- 
sages as the sole criterion, any pattern of 
n Ss is as efficient as any other n-sized 
pattern. 
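A brief sketch shows how the figure of $2(n-1)$ messages can be reached in practice: information is funnelled inward along a set of links that connects every position to one decision-maker, and the answer is returned over the same links. The `parent` mapping and the function name are assumptions made for the illustration; the sketch demonstrates that the count is attainable and is not a proof that fewer messages are impossible.

```python
def funnel_and_return(parent, root):
    """Count messages when information is pooled at `root` and the answer
    is sent back. `parent` maps each other position to the neighbor one
    link nearer the root (a hypothetical way of choosing channels)."""
    positions = set(parent) | {root}
    known = {p: {p} for p in positions}

    def depth(p):
        steps = 0
        while p != root:
            p = parent[p]
            steps += 1
        return steps

    messages = 0
    # Inward phase: the most distant positions send first, each passing
    # along everything it has collected in a single message.
    for p in sorted(parent, key=depth, reverse=True):
        known[parent[p]] |= known[p]
        messages += 1
    assert known[root] == positions  # the root now holds every item

    # Outward phase: the answer travels back over the same links.
    has_answer = {root}
    for p in sorted(parent, key=depth):
        assert parent[p] in has_answer
        has_answer.add(p)
        messages += 1
    assert has_answer == positions
    return messages


# Two hypothetical five-man arrangements, both centered on c.
chain_parent = {"a": "b", "b": "c", "d": "c", "e": "d"}
star_parent = {"a": "c", "b": "c", "d": "c", "e": "c"}
print(funnel_and_return(chain_parent, "c"), funnel_and_return(star_parent, "c"))
```

Both arrangements use eight messages for five Ss, in agreement with $C = 2(n-1)$.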

The Minimum Time Required for Solution.
If we assume “standard” Ss, all of
whom work, think, and write at the same 
speed, it is possible to calculate the limit set 
by the communication pattern on the speed 
with which the problem can be solved. 
Toward this end, we can arbitrarily define 
a time unit as the time required to complete 
any message, from its inception by any S to 
its reception by any other. 

For any n not a power of 2 and with
unrestricted linkage, when $2^x < n < 2^{x+1}$ and
x is an integer exponent, x + 1 equals the minimum
possible time units for solution of the problem.
Thus, for a five-man group we have
$2^x < 5 < 2^{x+1}$ becoming $2^2 < 5 < 2^3$, and x + 1 = 3
time units. No five-man pattern can be done
in less than three time units, although several
require more than three time units. When
n is an exact power of 2, the formula $2^x = n$
holds, and x = minimum time.²
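The rule above amounts to taking the smallest whole number x for which $2^x \ge n$. A minimal sketch follows; the function name is an assumption, and the code merely restates the empirical generalization given in the text rather than deriving it.

```python
def min_time_units(n):
    """Smallest x with 2**x >= n: x when n is an exact power of 2,
    otherwise x + 1 where 2**x < n < 2**(x + 1)."""
    if n < 1:
        raise ValueError("a group needs at least one member")
    return (n - 1).bit_length()


for group_size in (4, 5, 8):
    print(group_size, min_time_units(group_size))
```

For the five-man groups used here the function returns three time units, and for the four-man and eight-man cases mentioned in the footnote it returns two and three.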

It will be noted that, although some pat- 
terns require fewer time units than others, 
they may also require more message (m) 
units. This phenomenon, effectively the 
generalization that it requires increased mes- 
sages to save time units, holds for all the 
patterns we have examined. It is, however, 
true that certain patterns requiring different 
times can be solved in the same number of 
message units. 


Some Possible Effects of Various Patterns on 
the Performance of Individuals 

There are two general kinds of reasons 
which dictate against our theoretically perfect
performance from real people. The first
of these is the obvious one that people are 
not standardized. There are also the forces 
set up by the patterns themselves to be con- 
sidered. The problem becomes one of ana- 
lyzing the forces operating on an individual 
in any particular position in a communi- 
cation pattern and then predicting how the 
effects of these forces will be translated into 
behavior. 

It is our belief that the primary source of 
differential forces will be centrality. Cen- 
trality will be the chief (though perhaps not 
the sole) determinant of behavioral differ- 
ences because centrality reflects the extent to 
which one position is strategically located 
relative to other positions in the pattern. 

Our selection of centrality derives from the 
belief that availability of information neces- 
sary for the solution of the problem will be of 
prime importance in affecting one’s behavior. 
Centrality is a measure of one’s closeness to
all other group members and, hence, is a
measure of the availability of the information
necessary for solving the problem.

² This is an empirical generalization derived chiefly
from an analysis of a four-man square pattern. In such
a pattern, A and B, and C and D may swap information
in one time unit. Then A and C, and B and D may
swap in two time units to yield a complete solution.
For an eight-man ladder pattern the same simultaneous
swapping process yields a minimum time. For the
intervening n’s, at least “part” of a time unit is required,
in addition to the minimum time for the four-man
pattern. A detailed account of this analysis may be
found in a paper, as yet unpublished, by J. P. Macy, Jr.

Availability of information should affect 
behavior, in turn, by determining one’s role 
in the group. An individual who can rapidly 
collect information should see himself and 
be seen by others in a different way from an 
individual to whom vital information is not 
accessible. Such roles should be different in 
the extent to which they permit independ- 
ence of action, in the responsibility they 
entail, and in the monotony they impose. 
Finally, differences in independence, in re- 
sponsibility, and in monotony should affect 
the speed, the accuracy, the aggressiveness, 
and the flexibility of behavior. 


METHOD


The Problem to be Solved 

We have already described the task to be given 
our Ss—a task of discovering the single common 
symbol from among several symbols. When all
five men indicated that they knew the common 
symbol, a trial was ended. Another set of cards, 
with another common symbol, was then given to 
the Ss, and another trial was begun. 

Each group of Ss was given 15 consecutive trials. 
The composition of the standard sets of cards, used 
for all groups, is indicated in Figure 2, which indi- 
cates the symbol not on each person’s card for 
each trial. By referring this missing symbol to the 
set of six symbols at the top, the reader may recon- 
struct the symbols actually on each man’s card. 
The common symbol (the right answer) is also 
shown in Figure 2. 


The Apparatus 

The Ss were seated around a circular table
(Fig. 3) so that each was separated from the next
by a vertical partition from the center to six inches
beyond the table’s edge. The partitions had slots
permitting subjects to push written message cards
to the men on either side of them.

To allow for communication to the other men 
in the group, a five-layered pentagonal box was 
built and placed at the center of the table. The 
box was placed so that the partitions just touched 
each of the five points of the pentagon. Each of 
the five resulting wedge-shaped work-spaces was 
then painted a different color. The Ss were sup- 
plied with blank message cards whose colors 
matched that of their work spaces. Any message 
sent from a booth had to be on a card of the 
booth’s color. On the left wall of each partition, 
16 large symbol cards, representing 16 trials, were 
hung in loose-leaf fashion. The cards were placed 
in order with numbered backs to S. At the start- 
ing signal, S could pull down the first card and go 
to work. 

In addition, each work space was provided with 
a board on which were mounted six switches. 
Above each switch appeared one of the six symbols. 
When S got an answer to the problem, he was to 
throw the proper switch, which would turn on an 
appropriate light on a master board of 30 lights in 
the observer's room. When five lights (whether 
or not they were under the correct symbol), repre- 
senting five different Ss, were lit, the observer 
called a halt to the trial. The observer could tell 
by a glance at the light panel whether (a) five 
different Ss had thrown their switches, (b) whether 
all five had decided on the same answer, and (c) 
whether the answer decided on was right or wrong. 
The same detailed instructions were given to all Ss. 

A preliminary series of four problems, in which 
each S was given all the information required for 
solution, was used. This was done to note the 
extent of differences among Ss in the time required 
to solve such problems. 


The Procedure 

One hundred male undergraduates of M.I.T.,³
drawn from various classes at the Institute, served 
as Ss for these experiments. These 100 were split 
up into 20 groups of five men each. These 20 
groups were then further subdivided so that five 
groups could be tested on each of four experi- 
mental patterns. 

Each group was given 15 consecutive trials on 
one pattern, a process which required one session 
of about fifty minutes. These Ss were not used 
again. The order in which we used our patterns 
was also randomized. Just in case the color or 
geographical position of one’s work-space might 
affect one’s behavior, we shifted positions for each
new group. After a group had completed its 15
trials, and before members were permitted to talk
with one another, each member was asked to fill
out a questionnaire.

³ Data on female graduate students are being gathered
at M.I.T. by Smith and Bavelas, and the indications are
that their behavior differs in some ways from the
behavior of our male Ss.
Fig. 4. The Experimental Patterns


The Patterns Selected

The four five-man patterns selected for this
research are shown in Figure 4.

These four patterns represented extremes in
centrality (as in the circle vs. the wheel), as well
as considerable differences in other characteristics
(Table 1).


TABLE 1

CHARACTERISTICS OF THE EXPERIMENTAL PATTERNS

PATTERN   MOST CENTRAL   SUM OF      SUM OF      MIN. TIME   MIN.
          POSITION       NEIGHBORS   DISTANCES   UNITS       MESSAGES

CHAIN     C (6.7)                    40                      8 (5t)
Y         C (7.2)                    36                      8 (4t)
WHEEL     C (8.0)                    32                      8 (5t)
CIRCLE    All (5.0)                  30          3 (14m)     8 (5t)


RESULTS


The data which have been accumulated are
broken down in the pages that follow into
(a) a comparison of total patterns and (b) a
comparison of positions within patterns.


A. Differences among Patterns


It was possible to reconstruct a picture of
the operational methods actually used by
means of: (a) direct observations, (b) post-experimental
analysis of messages, and (c)
post-experimental talks with Ss.

The wheel operated in the same way in
all five cases. The peripheral men funnelled
information to the center where an answer
decision was made and the answer sent out.
This organization had usually evolved by the
fourth or fifth trial and remained in use
throughout.

The Y operated so as to give the most central
position, C (see Fig. 4 and Table 1),
complete decision-making authority. The
next-most-central position, D (see Fig. 4),
served only as a transmitter of information
and of answers. In at least one case, C transmitted
answers first to A and B and only
then to D. Organization for the Y evolved
a little more slowly than for the wheel, but,
once achieved, it was just as stable.

In the chain information was usually funnelled
in from both ends to C, whence the
answer was sent out in both directions.
There were several cases, however, in which
B or D reached an answer decision and
passed it to C. The organization was slower
in emerging than the Y’s or the wheel’s, but
consistent once reached.

The circle showed no consistent operational
organization. Most commonly messages
were just sent in both directions until
any S received an answer or worked one out.
In every case, all available links were used at
some time during the course of each trial.











Fig. 5. Median Group-Times per Trial


Direct Measures of Differences among Patterns


Time. The curves in Figure 5 are for 
correct trials only, that is, for trials in which 
all five switches represented the correct common
symbols. In most cases, the medians
shown are for distributions of five groups,
but in no case do they represent less than
three groups.

The variability of the distributions repre- 
sented by these medians is considerable. In 
the fifteenth trial, the distribution for the 
circle has a range of 50-96 seconds; for the 
chain, 28-220 seconds; for the Y, 24-52 
seconds; and for the wheel, 21-46 seconds. 
Moreover, much of the time that went to 
make up each trial was a constant consisting 
of writing and passing time. Any differ- 
ences attributable to pattern would be a 
small fraction of this large constant and 
would be easily obscured by accidents of mis- 
placing or dropping of messages. 

Despite all these factors, one measure of 
speed did give statistically significant differ- 
ences. A measure of the fastest single trial 
of each group indicates that the wheel was 
considerably faster (at its fastest) than the
circle (Table 2).


TABLE 2

FASTEST SINGLE CORRECT TRIAL

          CIRCLE   CHAIN   Y   WHEEL   DIFF.   p*

MEAN      50.4                         Ci-W    <.01
MEDIAN    55.0                         Ch-W    <.10
RANGE     44-59                        Ci-Y    <.05
                                       Ch-Y    <.20

* Significance of differences between means was measured
throughout by t-tests. The p-values are based on distributions of
t which include both tails of the distribution (see Freeman [2]).
Where differences are between proportions, p is derived from the
usual measure of significance of differences between proportions.

Ci-W means the circle-wheel difference, and so on.


Messages. The medians in Figure 6 represent
a count of the number of messages sent
by each group during a given (correct) trial.
It seems clear that the circle pattern used
more messages to solve the problem than the
others.

Fig. 6. Median Messages per Trial

Errors. An error was defined as the throwing
of any incorrect switch by an S during
a trial. Errors that were not corrected before
the end of a trial are labelled “final errors”;
the others are referred to as “corrected
errors.”

It should be pointed out that the error
figures for the wheel in Table 3 are distorted
by the peculiar behavior of one of the five
wheel groups. The center man in this group
took the messages which he received to be
answers rather than simple information, and,
in addition to throwing his own switch,
passed the information on as an answer.
This difficulty was cleared up after a few
trials, and the figures for the last eight trials
are probably more representative than the
figures for the full 15 trials.

In addition to the differences in errors,
there are differences in the proportion of
total errors that were corrected. Although
more errors were made in the circle pattern
than any other, a greater proportion of them
(61 per cent) were corrected than in any
other pattern. Too, the frequency of unanimous
five-man final errors is lower, both
absolutely and percentage-wise, for the circle
than for the chain.


TABLE 3

ERRORS

Columns: Total Errors (15 Trials); Total Errors (Last 8 Trials);
Final Errors; Mean No. of Trials with at Least One Final Error
Rows: Circle, Chain, Y, Wheel
[Table body largely illegible. Surviving entries: ranges 2-14
and 1-19; p Values, Ci-Y <0]


Questionnaire Results 
1. “Did your group have a leader? If so, who?”
Only 13 of 25 people who worked in the circle
named a leader, and those named were scattered
among all the positions in the circle. For all patterns,
the total frequency of people named increased
in the order circle, chain, Y, wheel. Similarly, the
unanimity of opinion increased in the same order
so that, for the wheel pattern, all 23 members who
recognized any leader agreed that position C was
that leader.

2. “Describe briefly the organization of your
group.”

The word “organization” in this question was
ambiguous. Some of the Ss understood the word
to mean pattern of communication, while others
equated it with their own duties or with status
difference. These differences in interpretation were
not random, however. Sixteen people in the wheel
groups fully reproduced the wheel structure in
answer to this question, while only one circle member
reproduced the circle pattern.

3. “How did you like your job in the group?”

In this question Ss were asked to place a check
on a rating scale marked “disliked it” at one end
and “liked it” at the other. For purposes of
analysis, the scale was translated into numerical
scores from 0 at the dislike end to 100. Each
rating was estimated only to the closest decile.

Again, we find the order circle, chain, Y, wheel,
with circle members enjoying their jobs significantly
more than the wheel members.
4. “See if you can recall how you felt about the
job as you went along. Draw the curve below.”

Ss were asked to sketch a curve into a space
provided for it. We measured the height of
these curves on a six-point scale at
trials 1, 5, 10, and 15. These
heights were averaged for each group, and the
averages of the group averages were [...]. Although the
differences between groups are not statistically
significant, trends of increasing satisfaction [...]
decreasing satisfaction in the [...]
the findings in the question [...]
one’s job. Except for a [...]
the order is, as usual, [...].

5. “[...] at any time, that kept
your group from performing at its best? If so,
what?”

The answers to this question were categorized as 
far as possible into several classes. 

None of the circle members feels that “nothing” 
was wrong with his group; a fact that is suggestive 
of an attitude different from that held by members 
of the other patterns. So, too, is the finding that 
insufficient knowledge of the pattern does not
appear as an obstacle to the circle member but is 
mentioned at least five times in each of the other 
patterns. 

6. “Do you think your group could improve its 
efficiency? If so, how?” 

Circle members place great emphasis on organiz- 
ing their groups, on working out a “system” (men- 
tioned 17 times). Members of the other patterns, 
if they felt that any improvement at all was pos- 
sible, emphasized a great variety of possibilities. 

7. “Rate your group on the scale below.” 

For purposes of analysis, these ratings (along a
straight line) were transposed into numbers from 0,
for “poor,” to 100.

The same progression of differences that we have
already encountered, the progression circle, chain,
Y, wheel, holds for this question. Once again the
circle group thinks less well of itself (Mean = 56)
than do the other patterns ($M_{ch}$ = 60; $M_{y}$ = 70;
$M_{w}$ = 91).


Message Analysis 


The messages sent by all Ss were collected
at the end of each experimental run and their
contents coded and categorized. Some of
these categories overlapped with others, and
hence some messages were counted in more
than one category.

The now familiar progression, circle, chain,
Y, wheel, continues into this area. Circle
members send many more informational
messages than members of the other patterns
($M_{c}$ = 283; $M_{ch}$ = 101). Circle members also
send more answers ($M_{c}$ = 91; $M_{ch}$ = 65).

The same tendency remains in proportion
to total errors as well as absolutely. The
circle has a mean of 4.8 recognition-of-error
messages for a mean of 16.6 errors; the chain
has a mean of 1 recognition-of-error messages
for a mean of 98 errors.

We were concerned, before beginning these 
experiments, lest Ss find short cuts for solving 
the problem, thus making certain compari- 
sons among patterns difficult. One such 
short cut we have called “elimination.” 
Instead of taking time to write their five
symbols, many Ss, after discovering that only
six symbols existed in all, wrote just the 
missing symbol, thus saving considerable 
time. This method was used by at least one 
member in two of the circle groups, in all the 
chain groups, in three of the Y groups, and 
in four of the wheel groups. In both the
circle cases, the method was used by all five 
members during final trials. In the chain, 
though present in every group, elimination 
was used only once by all five members, twice 
by three members, and twice by just one 
member. In the Y, the method was adopted 
once by four members (the fifth man was 
not the center) and twice by two members. 


There was at least one case (in the wheel)
in which a member who suggested the use
of elimination was ordered by another member
not to use it.

The questions raised here are two. Is the
idea of elimination more likely to occur in
some patterns than in others? Is an innovation
like elimination likely to be more
readily accepted in some patterns than in
others? To neither of these questions do we
have an adequate answer.


B. A Positional Analysis of the Data


Observation of the experimental patterns
indicates that every position in the circle is
indistinguishable from every other one. No
one has more neighbors, is more central, or
is closer to anyone than anyone else. In the
wheel, the four peripheral positions are alike,
and so on. Despite our inability to differentiate
these positions from one another, we
have set up the data in the following sections
as if all positions in each pattern were
actually different from one another.


Direct Observations

Messages. The most central positions, it
will be seen from Table 4, send the greatest
number of messages; the least central ones
send the fewest.

Errors. The analysis of total errors
made in each position showed nothing of
significance.


TABLE 4

NUMBER OF MESSAGES SENT BY EACH POSITION

                    B         C

CIRCLE   MEAN     83.6      90.0
         RANGE    60-98     63-102

CHAIN    MEAN     70.8      82.4
         RANGE    43-112    45-113

Y        MEAN     23.8      79.8
         RANGE    21-28     65-104

WHEEL    MEAN               102.8
         RANGE              78-138


Questionnaire Results by Position

1. “How much did you enjoy your job?”

The most central positions in other patterns enjoy
their jobs more than any circle position. Peripheral
positions, on the other hand, enjoy the job less than
any circle position (Table 5).

2. “See if you can recall how you felt about the 
job as you went along. Draw the curve below.” 

The data for this question are gathered after all 
most-peripheral and all most-central positions are 
combined. Peripheral positions were: positions A 
and E, in the chain; position E in the Y; and 
positions A, B, D, and E in the wheel. Central 
positions were all C positions with the exception 
of C in the circle. The data thus combined high- 
light the trend toward higher satisfaction with 
increasing centrality. The central positions pro- 
gress from a mean of 2.1 at trial 1 to a mean of 
3.9 at trial 15. Peripheral positions decline from 
3.9 to 2.3. 


Message Analysis by Position 

One of the things that immediately stands
out from an examination of the messages is
an apparent peculiarity in the informational
message category. Although the most central
man in the chain sends more informational
messages (52) than the other positions
in that pattern, the same is not true of the
most central men in the Y and the wheel.
In the Y, it is position D, the next-most-central
position, that sends most; while in
the wheel all positions are about equal. This
peculiarity becomes quite understandable if
we take into account (a) the kind of organization
used in each pattern and (b) the fact
that these figures represent the entire 15
trials, some of which occurred before the
group had itself stably organized. In the
wheel, the Y, and the chain, the center man
needed to send no informational messages,
only answers; but in the early trials,
before his role was clarified, he apparently
sent enough to bring his total up to or higher
than the level of the rest.

It can also be noted that the number of
organizational messages (messages which
seek to establish some plan of action for
future trials) is negatively correlated with
positional centrality. The most peripheral
men send the greatest numbers of organizational
messages, the most central men least.


TABLE 5

ENJOYMENT OF THE JOB

[Table body illegible. Differences tested: C-E, C-AE, C-A,
C-AB, D-E, B-C, C-E, ABED-C]


DISCUSSION


Patternwise, the picture formed by the
results is of differences almost always in the
order circle, chain, Y, wheel.

We may grossly characterize the kinds of
differences that occur in this way: the circle,
one extreme, is active, leaderless, unorganized,
erratic, and yet is enjoyed by its members.
The wheel, at the other extreme, is
less active, has a distinct leader, is well and
stably organized, is less erratic, and yet is
unsatisfying to most of its members.

There are two questions raised by these
behavioral differences. First, what was
wrong with our a priori time-unit analysis?
The results measured in clock time do not
at all match the time-unit figures. And
second, to what extent are behavioral differences
matched by centrality differences?


The Time Unit

It was hypothesized earlier that the time
taken to solve a problem should be limited
at the lower end by the structure of the pattern
of communication. If pattern does set
such a limitation on speed, the limitation is
not in the direction we would have predicted.
Our analysis (Table 1), based on a theoretical
time unit, led us falsely to expect greatest
speed from the circle pattern.

There are three outstanding reasons for the 
failure of the time-unit analysis to predict 
clock time. First, the time unit, itself, was 
too gross a measure. We defined the time 
unit as the time required for the transmission 
of one message from its inception to its 
reception. In actuality, different kinds of 
messages required very different clock times 
for transmission. Ss could send two mes- 
sages simultaneously. They could also lay 
out and write several messages before send- 
ing any. 

A second reason for the failure of the time- 
unit analysis was the assumption that Ss 
would gravitate to the theoretically “best”-operating
organization. Only the wheel
groups used the theoretically “best” method 
(the minimum time method) consistently. 

Finally, it should be pointed out that dif- 
ferences in speed between patterns were sub- 
ject to major fluctuations for reasons of 
differences in writing speed, dexterity in 
passing messages, and other extraneous 
factors. 


The Relation of the Centrality Measure to 
Behavior 

Our second and more important question 
is: Are the behavioral differences among 
patterns and among positions related con- 
sistently to the centrality index? An exam- 
ination of Table 1 indicates that the cen- 
trality index shows the same progression, 
circle, chain, Y, wheel, as do most of the 
behavioral differences. On a positional basis, 
centrality also differentiates members of a 
pattern in the same order that their be- 
havior does. 

Because such a relationship does exist be- 
tween behavior and centrality, a more de- 
tailed consideration of the centrality concept 
is in order. 

The central region of a structure is defined 
by Bavelas as “the class of all cells with the 
smallest p to be found in the structure.” 
The quantity, p, in turn, is defined as the 
largest distance between one cell and any 
other cell in the structure. Distance is meas- 
ured in link units. Thus the distance from 
A to B in the chain is one link; from A to C 
the distance is two links. The most central 
position in a pattern is the position that is 
closest to all other positions. Quantitatively, 
an index of the centrality of position A in 
any pattern can be found by (a) summing 
the shortest distances from each position to 
every other one and (b) dividing this sum- 
mation by the total of the shortest distances 
from position A to every other position. 

Centrality, then, is a function of the size 
of a pattern as well as of its structure. Thus, 
in a five-man circle, the centrality of each 
man is 5.0. In a six-man circle, the centrality 
of each man jumps to 6.0. The two most 
peripheral men in a five-man chain each 
have a centrality of 4.0. But in a seven-man 
chain, the two most peripheral men have 
centralities of 5.3. 
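The centrality index just defined, and its dependence on group size, can be checked with a short sketch. The link lists below are assumptions reconstructed from the verbal descriptions of the four patterns (in particular, the Y is taken to join A and B to C, C to D, and D to E); the helper names are hypothetical.

```python
from collections import deque


def pattern(*pairs):
    """Build a two-way pattern from (position, position) links."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def distances_from(links, start):
    """Shortest distance, in links, from `start` to every position."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        here = queue.popleft()
        for nxt in links[here]:
            if nxt not in dist:
                dist[nxt] = dist[here] + 1
                queue.append(nxt)
    return dist


def centrality(links):
    """Sum of all shortest distances in the pattern, divided by the
    sum of shortest distances from the position in question."""
    own = {p: sum(distances_from(links, p).values()) for p in links}
    total = sum(own.values())
    return {p: total / own[p] for p in links}


CHAIN = pattern(("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"))
Y = pattern(("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"))
WHEEL = pattern(("A", "C"), ("B", "C"), ("D", "C"), ("E", "C"))
CIRCLE = pattern(("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"))


def ring(n):
    """An n-man circle with positions numbered 0 to n - 1."""
    return pattern(*[(i, (i + 1) % n) for i in range(n)])


def line(n):
    """An n-man chain with positions numbered 0 to n - 1."""
    return pattern(*[(i, i + 1) for i in range(n - 1)])


for name, links in [("chain", CHAIN), ("Y", Y), ("wheel", WHEEL), ("circle", CIRCLE)]:
    print(name, {p: round(c, 1) for p, c in sorted(centrality(links).items())})
```

Under these assumed layouts the most central positions come out at 6.7, 7.2, and 8.0 for the chain, Y, and wheel, and every circle position at 5.0, matching Table 1; `ring(6)` and `line(7)` reproduce the six-man circle and seven-man chain figures quoted above.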


In Figure 7 are given the centralities of 
each position in each of our four test patterns. 
The sum of centralities is also given. Both 
total centrality and distribution of centralities 
fall in the order circle, chain, Y, wheel. 

These centrality figures correlate with the 
behavior we have observed. But it seems 
unreasonable to assume that the correlation 
would hold for larger n’s. Certainly we 
would not expect more message activity or 
more satisfaction from peripheral positions 
in a chain of a larger n than from a five-man
chain. 

Fig. 7. Centrality Indices (above) and
Peripherality Indices (below)

To obviate this difficulty, a measure we
have called “relative peripherality” may be
established. The relative peripherality of
any position in a pattern is the difference
between the centrality of that position and 
the centrality of the most central position in 
that pattern. Thus, for the two end men in 
a five-man chain, the peripherality index is 
2.7 (the difference between their centralities 
of 4.0 and the centrality of the most central 
position, 6.7). For a total pattern, the 
peripherality index may be taken by sum- 
mating all the peripherality indices in the 
pattern (Fig. 7). 
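Relative peripherality follows directly from the centrality index, as the sketch below illustrates. The pattern layouts and function names are assumptions made for the example; the values printed are computed from those assumed layouts and are not read from Figure 7.

```python
from collections import deque


def pattern(*pairs):
    """Build a two-way pattern from (position, position) links."""
    links = {}
    for a, b in pairs:
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    return links


def centrality(links):
    """Total of all shortest distances divided by each position's own total."""
    own = {}
    for start in links:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            here = queue.popleft()
            for nxt in links[here]:
                if nxt not in dist:
                    dist[nxt] = dist[here] + 1
                    queue.append(nxt)
        own[start] = sum(dist.values())
    total = sum(own.values())
    return {p: total / own[p] for p in links}


def peripherality(links):
    """Centrality of the most central position minus each position's own."""
    cent = centrality(links)
    top = max(cent.values())
    return {p: top - c for p, c in cent.items()}


def total_peripherality(links):
    """Peripherality index for the pattern as a whole."""
    return sum(peripherality(links).values())


CHAIN = pattern(("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"))
CIRCLE = pattern(("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A"))

print({p: round(v, 1) for p, v in sorted(peripherality(CHAIN).items())})
print(round(total_peripherality(CIRCLE), 1))
```

The end men of the five-man chain come out at 2.7, as stated above, and the circle totals zero because no position is more central than any other.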

Examination of the data will show that 
observed differences in behavior correlate 
positively with these peripherality measures. 
By total pattern, messages, satisfaction, and 
errors (except for the wheel) vary con- 
sistently with total peripherality index. 
Similarly, by position, messages and satisfaction
vary with peripherality. Errors,
however, show no clear relationship with
peripherality of position, a finding which is 
discussed in detail later in this section. 

Recognition of a leader also seems to be a 
function of peripherality, but in a somewhat 
different way. A review of our leadership 
findings will show that leadership becomes 
more clear-cut as the differences in periph- 
erality within a pattern become greater. 
Recognition of a leader seems to be deter- 
mined by the extent of the difference in cen- 
trality between the most central and next- 
most-central man. 

There arises next the question: What is the 
mechanism by which the peripherality of a 
pattern or a position affects the behavior of 
persons occupying that pattern or position? 

A reconstruction of the experimental situ- 
ation leads us to this analysis of the periph- 
erality-behavior relationship: 

First, let us assume standard Ss, motivated 
to try to solve our experimental problem as 
quickly as possible. Let them be “intelli- 
gent” Ss who do not send the same informa- 
tion more than once to any neighbor. Let 
them also be Ss who, given several neighbors, 
will send, with equal probability, their first 
message to any one of those neighbors. 

Given such standard Ss, certain specific 
positions will probably get an answer to the 
problem before other positions. In the chain, 
position C will be most likely to get the 
answer first, but, in the circle, all positions 
have an equal opportunity. 

To illustrate, consider the chain pattern 
(see Fig. 4): During time unit 1, A may send 
only to B. B may send either to C or to A. 
C may send either to B or to D. D may 
send either to C or to E. E may send only 
to D. No matter where B, C, and D send 
their messages, B and D will have, at the end 
of one time unit, A’s and E’s information. 
During the second time unit, if B and/or D 
had sent to C the first time, they will now 
send to A and E. If they sent to A and E 
the first time, they will send to C, and C 
will have the answer. Even if B and D do 
not send to C until the third time unit, C 
will either get the answer before or simul- 
taneously with B and D. In no case can 
any other position beat C to the answer. In 
the wheel, C cannot even be tied in getting
an answer. He will always get it first.


Our second concern is with Ss’ perceptions 
of these answer-getting potentials. We sug- 
gest that these random differences in answer- 
getting potentials rapidly structure members’ 
perceptions of their own roles in the group. 
These differences affect one’s independence 
from, or dependence on, the other members 
of the group. In the wheel, for example, a 
peripheral S perceives, at first, only that he 
gets the answer and information from C and 
can send only to C. C perceives that he gets 
information from everyone and must send 
the answer to everyone. The recognition of 
roles is easy. The peripheral men are dependent
on C. C is autonomous and controls
the organization.

In the circle, an S’s perception must be 
very different. He gets information from 
both sides; sometimes he gets the answer, 
sometimes he sends it. He has two channels 
of communication. He is exclusively depend- 
ent on no one. His role is not clearly dif- 
ferent from anyone else’s. 

Thirdly, having closed the gap between 
structural pattern and Ss’ perceptions of their 
roles in the group, the problem reduces to 
one purely psychological. The question be- 
comes: How do differences in one’s percep- 
tion of one’s own dependence or independ- 
ence bring about specific behavior differences 
of the sort we have observed? 

Differences in satisfaction level are rela- 
tively easy to relate to independence. In our 
culture, in which needs for autonomy, recog- 
nition, and achievement are strong, it is to 
be expected that positions which limit inde- 
pendence of action (peripheral positions) 
would be unsatisfying. 

A fairly direct relationship between cen- 
trality (and, hence, independence) and the 
speed with which a group gets organized is 
also perceptible. In the wheel, unless Ss act 
“unintelligently,” an organization, with C 
as center, is forced on the wheel groups by 
the structural pattern. In the circle, no such 
differences in role and, hence, in organiza- 
tion are forced on the group. 

Message-activity can also be related to centrality
by means of the independence-of-action
concept. A peripheral person in any
pattern can send messages to only one other
position. Only one informational message
is called for. Extra messages would be repetitious.
Central positions, however, are free
to send more than one non-repetitious infor- 
mational message until an organization 
evolves. Once the most central man per- 
ceives that he is most central, he need send 
no informational messages. But so long as 
the most central man does not perceive his 
own position, it is intelligent to send infor- 
mational messages to whomever he feels may 
require some information. It is in keeping 
with this analysis that the circle should yield 
maximum messages and the wheel minimum 
messages. 

If the behavior of one of the wheel groups 
can be discounted, then an explanation, in 
terms of peripherality, is also possible for 
both differences in tendencies to correct 
errors and total error differences. 

If peripherality determines one’s independ- 
ence of action, it seems very likely that posi- 
tions most limited in independence should 
begin to perceive themselves as subordinates 
whose sole function is to send information 
and await an answer. That they should then 
uncritically accept whatever answer they 
receive is perfectly in keeping with their 
subordinate, relatively unresponsible positions;
hence, very little correction of errors
in the patterns in which there are great dif- 
ferences in peripherality. 

Total errors, it will be recalled, were cor- 
related with total peripherality indices but 
showed no clear relationship with the relative 
peripherality of particular positions. A con- 
sideration of our definition of error may shed 
some light on this apparent anomaly. 

The “errors” that we recorded were signals 
from the S that indicated a wrong answer. 
But these wrong answers derived from a 
variety of sources. First, Ss might wrongly 
interpret the correct information they re- 
ceived. They might also make errors in 
throwing switches; and they might also cor- 
rectly interpret wrong information. In all 
three cases, “errors” were recorded. 

We submit that this broad definition of 
error should yield a total pattern relationship 
with peripherality, but no positional relation- 
ship. Our reasoning can be illustrated by an 
example. Suppose that the central man in 
the wheel wrongly interprets information 
sent to him and, hence, throws an incorrect
switch. This is a “real” error. He then
funnels out the wrong answer to the other
members. At least three of these intelligently 
conclude that the answer sent them is correct 
and also throw the wrong switches. We then 
have three “false” errors consequent to our 
single “real” one. When several independ- 
ent answer decisions are made (as in the 
circle), we should expect several real errors, 
multiplication of these by a factor of about 3, 
and a larger total of errors. This process 
should lead to a correlation between total 
pattern behavior and peripherality but not 
to a correlation between positional behavior 
and peripherality. The process simply 
multiplies real errors more or less constantly 
for a whole pattern but obscures positional 
differences because the “real” and the “false” 
errors are indistinguishable in our data. 

We submit, further, that pattern differ- 
ences in real errors, if such there be, may be 
attributable to “over-information”; too much 
information to too many members which, 
under pressure, leads to errors. Central 
positions or positions which are no less cen- 
tral than others in the pattern should be the 
ones to yield the greatest number of real 
errors, while peripheral positions, which re- 
quire no such rapid collation of information, 
should be the false error sources. Such an 
hypothesis would be in keeping with our 
total pattern findings and might also clarify 
our positional findings. Only an experiment 
designed to differentiate real from false errors 
can answer this question. 

It is in keeping with this peripherality- 
independence analysis, also, that we should 
find the recognition of a single leader occur- 
ring most frequently in the wheel and Y 
groups. It is also to be expected that we 
should find circle members emphasizing need 
for organization and planning and seldom 
giving a complete picture of their pattern. 
Perhaps, too, it is reasonable to expect that 
the whole group should be considered good 
in the highly organized wheel (and not so 
good in the unorganized circle) even though 
one’s own job is considered poor. 

In summary, then, it is our feeling that 
centrality determines behavior by limiting 
independence of action, thus producing dif- 
ferences in activity, accuracy, satisfaction, 
leadership, recognition of pattern, and other 
behavioral characteristics. 
SUMMARY AND CONCLUSIONS


Within the limits set by the experimental
conditions (group size, type of problem,
source of Ss), these conclusions seem
warranted:

1. The communication patterns within
which our groups worked affected their behavior.
The major behavioral differences
attributable to communication patterns were
differences in accuracy, total activity, satisfaction
of group members, emergence of a
leader, and organization of the group.
There may also be differences among patterns
in speed of problem solving, self-correcting
tendencies, and durability of the
group as a group.

2. The positions which individuals occupied
in a communication pattern affected
their behavior while occupying those positions.
One’s position in the group affected
the chances of becoming a leader of the
group, one’s satisfaction with one’s job and
with the group, the quantity of one’s activity,
and the extent to which one contributed to
the group’s functional organization.

3. The characteristic of communication
patterns that was most clearly correlated
with behavioral differences was centrality.
Total pattern differences in behavior seemed
to be correlated with a measure of centrality
we have labelled the peripherality index.
Positional differences in behavior seemed to
be correlated with the positional peripherality
indices of the various positions within
patterns.

4. It is tentatively suggested that centrality 
affects behavior via the limits that centrality 
imposes upon independent action. Inde- 
pendence of action, relative to other mem- 
bers of the group is, in turn, held to be the 
primary determinant of the definition of who 
shall take the leadership role, total activity, 
satisfaction with one’s lot, and other specific 
behaviors. 

More precisely, it is felt that where cen- 
trality and, hence, independence are evenly 
distributed, there will be no leader, many 
errors, high activity, slow organization, and 
high satisfaction. Whatever frustration
occurs will occur as a result of the inadequacy 
of the group, not the inadequacy of the 
environment. 

Where one position is low in centrality 
relative to other members of the group, that 
position will be a follower position, depend- 
ent on the leader, accepting his dictates, fall- 
ing into a role that allows little opportunity 
for prestige, activity, or self-expression. 
THE MODIFICATION OF STUTTERING THROUGH
NON-REINFORCEMENT * 


BY JOSEPH G. SHEEHAN 


University of California, Los Angeles 


STUTTERING is of special interest to the
psychologist in that it involves non-integrative
or persistently maladaptive
behavior. The challenge in stuttering behav- 
ior lies in the explanation of its reinforcement, 
the problem of why the behavior continues 
despite its unserviceability. That speech 
pathologists have been concerned with this 
problem and are aware of its crucial role in 
the understanding of stuttering, is apparent 
from the writings of Dunlap (4), Van 
Riper (14), and Johnson (8). However, the 
exact nature of reinforcement in stuttering 
has never been formulated and subjected to 
experimental test. 

In the past the problem of stuttering has 
been approached from a variety of disciplines, 
including the physiological, the neurological, 
the psychoanalytic, the personality, the 
semantic, and the developmental. Frequently,
two or three approaches overlap within a
single explanation. Different investigators
have addressed themselves to different 
aspects of the problem, some to nature and 
cause, some to treatment, and some entirely 
to personality variables. Too often there has 
been scant relation between an authority's 
theoretical and therapeutic treatment of the 
problem. For example, many of the older 
workers in the field who ascribe stuttering to 
physiological causes are still advocating for 
its treatment phonetic drills and breathing 
exercises which bear no relation to the 
assumed physiological bases. 

Although nearly all authorities have ob- 
served the presence of a “habit pattern” in 
stuttering, many of them make no provision 
for dealing directly with it in their systems 
of treatment. Some seem to imply that
purely verbal procedures, e.g., nondirective
counseling, will cause the habit pattern to
disappear automatically. Even among those
who do work on the habit pattern, few have
made an effective application of scientific
principles of learning. This lack has been
characteristic of speech correction practices
in general.

1 From a doctoral dissertation submitted at the University
of Michigan in August, 1949. The author is
indebted to Dr. Edward L. Walker for his guidance,
and he is grateful to Dr. Wendell Johnson and Dr. C.
Van Riper for making subjects available from their
clinics and for stimulating counsel in the initial stages
of the study.

It is with an awareness of this lack that 
the present study has attacked the problem 
of stuttering within the framework of mod- 
ern learning theory. That habit plays an 
important role in stuttering has, of course, 
long been recognized. It may be worth 
while to trace briefly the work of those who 
have treated stuttering within a learning 
framework and to examine their contribu- 
tions. They include references both to the 
origin of stuttering and to the growth of the 
stuttering pattern through a building-up of 
new habits. 

A generation ago the educational theory, 
which regarded stuttering as a bad habit 
arising out of the natural hesitancies of child- 
hood speech, enjoyed a wide following. 
Among those who have propounded this 
view are Stoddard (13) and Russell (11). 

Bluemel (2), who utilized Pavlovian con- 
cepts, considered speech to be a conditioned 
response and stuttering to result from the 
inhibition of this conditioned response 
through traumatic experiences. 

In applying his beta hypothesis, Dunlap 
treated stuttering essentially as a form of 
learned behavior, in which unlearning of the 
stuttering habit could be facilitated by “nega- 
tive practice.” 

An important early study by Van Riper (14) 
bears closely upon the reinforcement prob- 
lem in stuttering. Van Riper investigated 
the effect of punishment on the stuttering 
response, finding that the frequency of stut- 
tering was increased by the administration of 
electric shock following the response. 

The acquisition of new responses in stuttering,
described by Van Riper (15), may be
expressed in learning terms as follows:

In attempting to say a difficult word the
stutterer finds that employing a novel response,
such as a sudden intake of breath,
releases the word more quickly due to the
disinhibiting effect of response-produced
stimuli. With continued use the device loses 
its disinhibitory properties and becomes in- 
corporated into the characteristic pattern of 
the stuttering. The response loses its volun- 
tary characteristics along with its effective- 
ness, and the stutterer soon finds himself 
gasping automatically with every stuttered 
word. This cycle is repeated as he seeks 
relief in other novel responses. 

The Iowa Studies in the Psychology of
Stuttering, directed by Wendell Johnson,
have investigated various stimulus variables 
of which the stuttering response is a func- 
tion. They have included studies of the 
adaptation effect—the reduction in frequency 
of stutterings with repeated readings of the 
same passage—and of the consistency effect— 
the degree to which the same words appear 
as stutterings from earlier to later readings. 
An excellent summary of these studies, along
with other investigations of the cues and
conditions under which stuttering is increased
or diminished, may be found in
Bloodstein (1).

In one of the series directed by Johnson,
Harris (5) concluded his discussion with
this provocative statement:


Is the adaptation effect to be regarded as a
type of experimental extinction? This last question
serves to raise the more general and theoretically
very important question as to the degree to which
and in what particular stuttering is to be interpreted
as learned behavior. Speech pathologists could
hardly find a more crucial issue for investigation
in the further study of stuttering.


Johnson has concluded, from the results of
the Iowa Studies, from the Davis study (3),
and from his own investigation of the onset
and development of stuttering (7), that stuttering
as a speech disorder develops after
diagnosis, i.e., it is a learned anxiety system
resulting from the evaluative behavior of
parents, teachers, and others close to the
stutterer. A child becomes a stutterer after
he has been labelled one. Johnson has spoken
of stutterers as having “tongues that learn to
stumble,” not because they are innately 
deviant but because of the nature of their 
semantic environment. 

Again illustrating that approaches to stut- 
tering do not necessarily exclude each other, 
Johnson’s semantic approach is at the same 
time avowedly a learning theory. One of 
Johnson’s students, Shulman (12), concluded 
in summarizing his study of the adaptation 
and consistency effects that “stuttering is primarily
a form of learned behavior.” Another
of Johnson’s students, Wischner (17), at- 
tempted systematically to interpret stuttering 
data in conditioning terms such as generali- 
zation, extinction, and spontaneous recovery, 
and suggested further study of stuttering as 
learned behavior. 

Somewhat earlier, Hill (6) had independ- 
ently worked out a stimulus-response inter- 
pretation of stuttering behavior within the 
“interbehavioral analysis” of J. R. Kantor. 

Harris’ suggestion, quoted above, that 
adaptation may be a form of experimental 
extinction, has been elaborated by Wischner, 
who has drawn up points of similarity be- 
tween the two. The seeming parallel is 
furthered by earlier findings, such as Shul- 
man’s that “relative massing of the oral 
reading periods is conducive to greater adaptation
than would be attained under conditions
of distributed practice” (12).

In the standard adaptation situation, the
stutterer reads the same passage over, and,
during the successive readings, there is a
reduction in the amount of stuttering. The
words “experimental extinction” should not
be applied to this treatment, however, since
the decrease in stuttering takes place in spite
of the fact that the stuttering behavior is
being continually reinforced. The word “extinction”
ought to be reserved for the decrement
of a response under non-reinforcement.

Within a learning approach to stuttering,
there are of course many possibilities other
than those related above. For instance, stuttering
could be investigated (1) in terms of
the effect of non-reinforcement on the stuttering
response; (2) as conflict behavior; (3) in
terms of the functioning of preparatory sets
or expectancies; (4) in terms of the fractional
anticipatory goal response; (5) in
terms of maladaptive anticipatory startle
responses involved; (6) in terms of the role
of configurations or disorders of perceptual 
organization; (7) in terms of alteration of 
the stuttering pattern—the acquisition and 
dropping out of specific movements within 
the stuttering response; (8) in terms of the 
cues which determine the form of the stut- 
tering block and the moment of its release; 
(9) in terms of a possible decrease in the 
discriminability of instrumental acts under 
increased motivation. It can thus be seen 
that the selection of the present problem is 
arbitrary—others could have been employed 
within a learning approach. 

The present study is concerned primarily 
with stuttering’s reinforcement aspect and 
secondarily with its conflict aspect. Rein- 
forcement has been made the focus of attack 
here because the persistence of stuttering 
behavior is one of its principal mysteries and 
because it is felt that utilization of non-rein- 
forcement techniques has been one of stut- 
tering therapy’s greatest needs. 


STATEMENT OF THE PROBLEM 


Since stuttering involves behavior which is 
apparently more punishing than rewarding, 
why does the behavior persist? Why doesn’t 
the stuttering response extinguish? 

One answer which can be given is that, 
under ordinary conditions, the stuttering 
response is reinforced. As the instrumental 
act by means of which the stutterer is 
attempting to produce the difficult word, it 
eventually does lead to the production of the 
word and so to the termination of the anxiety 
which the word has elicited. The situation 
might be diagrammed as follows: 


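$$S_{\text{word}} \rightarrow R_{\text{anx.}} \rightarrow S_{\text{anx.}} \rightarrow R_{\text{stutt.}} \rightarrow G_{\text{term. seq.}}$$


$S_{\text{word}}$ can be taken to be a printed word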
on a page which has cue value in terms of the 
stutterer’s past experience with similar words 
in similar situations. The word is a stimulus 
to anxiety, which in turn elicits the avoidant 
behavior we call “stuttering.” The goal, G, 
is here defined as the termination of the 
sequence, the point at which the stutterer is 
able to go on to the next word. 

In the case of the normal speaker, however,
and in the case of the stutterer when
he has no trouble on the word:


$$S_{\text{word}} \rightarrow R_{\text{normal sp. attempt}} \rightarrow G_{\text{term. seq.}}$$


Here a normal speech attempt, rather than 
the stuttering response, is the instrumental 
act leading to reinforcement. The assump- 
tion is made that the termination of the 
sequence, the point at which the stutterer 
is able to go on to the next word, is the 
point of reinforcement, and that the instru- 
mental act which terminates the sequence is 
reinforced. 

There are times when a stutterer makes a 
normal speech attempt even in the presence 
of anxiety (15). This is what may under 
favorable conditions happen in therapy. 


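$$S_{\text{word}} \rightarrow R_{\text{anx.}} \rightarrow S_{\text{anx.}} \rightarrow R_{\text{normal sp. att.}} \rightarrow G_{\text{term. seq.}}$$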


Here there is no reinforcement of the stut- 
tering response, but the introduction of a set 
which operates in the presence of anxiety- 
producing cues to elicit a normal speech 
attempt. This is the kind of training that 
can reduce stuttering on a permanent basis. 
Older forms of treatment failed to achieve 
this because they merely tried to increase the 
number of normal speech attempts by pre- 
venting anxiety through “confidence” meas- 
ures, but gave the stutterer nothing to help 
him deal with anxiety when it was elicited. 

The crucial feature of the above chain of 
events is that the instrumental act leading to 
the termination of the sequence is the normal 
speech attempt. However, a stutterer ordi- 
narily does not respond to anxiety with a 
normal speech attempt. Usually, stuttering 
is the response which gets reinforced. In the 
present study the experimental group is 
given a set which does not permit the termi- 
nation of the sequence until a normal speech 
attempt has taken place. Ordinarily, the 
goal would be simply to speak and be under- 
stood; the stuttering would be incidental, in 
fact, necessary, to the attainment of that goal. 
The experimental set here changes the goals 
so that the stutterer is no longer rewarded 
just to speak and be understood. The new 
goal involves making a normal speech
attempt before leaving the word, no matter 
how much stuttering takes place prior to 
that point. 


$$S_{\text{word}} \rightarrow R_{\text{anx.}} \rightarrow S_{\text{anx.}} \rightarrow R_{\text{stutt.}} \rightarrow S_{\text{exper. set}} \rightarrow R_{\text{norm. sp. att.}} \rightarrow G_{\text{term. seq.}}$$


If we can succeed in giving the stutterer
a set which will permit non-reinforcement
of the stuttering, it will be a tremendous lead
for therapy. The set in this experiment is
something which the inexperienced therapist,
or even the child or the outpatient, could
utilize. Many cases of adult stuttering are
believed to be on a self-maintaining or functionally
autonomous level. For these cases,
finding a means for preventing continued
reinforcement of the stuttering response
should be especially helpful.

The reinforcement implications of our experiment
may be further developed by a brief
sketching of its conflict aspect.

Stuttering may be considered a resultant
of opposed urges to speak and to retreat from
the speaking of the word feared. This
conflict aspect of stuttering, which is discussed
more fully below, may be included in
our diagram:


$$
\begin{array}{l}
S_{\text{word}} \rightarrow R_{\text{anx.}} \rightarrow S_{\text{avoid.}} \searrow \\
\qquad\qquad\qquad\qquad\qquad\qquad R_{\text{stutt.}} \rightarrow G_{\text{term. seq.}} \\
S_{\text{sit.}} \rightarrow R_{\text{sp.}} \rightarrow S_{\text{approach}} \nearrow
\end{array}
$$

[Broken arrows in the original diagram cross-connect $S_{\text{word}}$ and $S_{\text{sit.}}$ to the opposite responses.]


Following first the solid arrows, we have
again $S_{\text{word}}$, an anxiety-arousing cue which
leads to an avoidance drive, and opposing
this we have $S_{\text{sit.}}$, the stimulus of the situation
which calls for the response of speaking
and leads to an approach drive. The resultant
response, stuttering, terminates the
sequence and is reinforced. This is the
nature of the conflict in word-fear.

By following the broken arrows, it will
be noted that $S_{\text{word}}$ and $S_{\text{sit.}}$ now have exchanged
places in terms of cue value, so that
$S_{\text{word}}$ becomes a stimulus to speak and $S_{\text{sit.}}$
becomes a stimulus to anxiety and avoidance.
The resultant response, stuttering, gets reinforcement
as before. This is the nature of the
conflict in situation-fear.

It should be noted further that $S_{\text{word}}$ by
itself can evoke conflicting responses. Ordinarily
a word is a stimulus to speak; but
when word-fear is present, the same word
can also be a stimulus to anxiety and avoidance.
$S_{\text{sit.}}$, too, can in itself evoke conflicting
responses. As every stutterer knows, a situation
can demand speech but at the same
time hold enough threat so that there is a
competing drive to hold back from speaking.
The resultant between the opposed urges to
speak and not to speak is in each instance
the response of stuttering, and in each
instance it gets reinforcement.

Suppose we insert in this sequence, by 
means of an experimental set, a normal 
speaking of the word at a point between 
the stuttering and the termination of the 
sequence. Now the approach response, that 
of speaking the word normally, is strength- 
ened, while the avoidant response is moved 
farther away from the point of reinforcement 
and is correspondingly weakened. We would 
predict from our formulation that such a 
technique would lead to more normal speech 
and less stuttering. 

The general hypothesis of this study is 
that stuttering is reduced most rapidly under 
conditions which permit least reinforcement 
of the stuttering response. 

Specific hypotheses may be stated as
follows:

1. There will be greater decrements in the
frequency of stuttering for the experimental
than for the control conditions described:
(1) Control: each subject reads the passage
six times in his characteristic way; (2) Experimental:
the subject reads the passage five
consecutive times, repeating each stuttered
word until he has said it once successfully,
without stuttering, before he goes on to the
next word, and then reads it a sixth time as
he normally would.

2. The difference between the experimental 
and control conditions will persist during a 
30-minute interval. 

3. The experimental treatment will reduce 
the “consistency effect,” i.e. the particular 
words stuttered on in the first reading will be 
stuttered on less during subsequent readings
in the experimental or non-reinforcement 
condition than in the control condition. 
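The three hypotheses turn on two quantities: the drop in stuttering frequency between readings and the carry-over of the same stuttered words from the first reading to later ones. The following is a minimal sketch of how such scores could be computed from the words marked in each reading. The function names, the percentage and proportion formulas, and the sample word positions are assumptions made for illustration; they are not the author's scoring procedure or data. The 200-word passage length and the 2 per cent inclusion criterion are those described in the Method section.

```python
PASSAGE_LENGTH = 200  # words per passage, as described under Apparatus and Materials


def stuttering_frequency(stuttered, passage_length=PASSAGE_LENGTH):
    """Percentage of the words in the passage marked as stuttered in one reading."""
    return 100.0 * len(stuttered) / passage_length


def decrement(first, later, passage_length=PASSAGE_LENGTH):
    """Drop in stuttering frequency, in percentage points, from first to later reading."""
    return (stuttering_frequency(first, passage_length)
            - stuttering_frequency(later, passage_length))


def consistency(first, later):
    """Proportion of the words stuttered in the first reading that recur later."""
    if not first:
        raise ValueError("no stuttered words in the first reading")
    return len(first & later) / len(first)


def meets_criterion(first, passage_length=PASSAGE_LENGTH, minimum_percent=2.0):
    """Inclusion rule: stuttering on at least 2 per cent of the words in the first reading."""
    return stuttering_frequency(first, passage_length) >= minimum_percent


# Made-up word positions (indices into a 200-word passage), not data from the study.
reading_1 = {3, 17, 42, 58, 90, 121, 150, 188}
reading_6_control = {3, 17, 58, 121, 150}
reading_6_experimental = {17, 150, 160}

if __name__ == "__main__":
    print("included:", meets_criterion(reading_1))
    print("control decrement:", decrement(reading_1, reading_6_control))
    print("experimental decrement:", decrement(reading_1, reading_6_experimental))
    print("control consistency:", consistency(reading_1, reading_6_control))
    print("experimental consistency:", consistency(reading_1, reading_6_experimental))
```

In this invented example the experimental reading shows both the larger decrement and the lower consistency, which is the pattern hypotheses 1 and 3 predict.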


METHOD 

Subjects 

The subjects (Ss) used were 20 adult stutterers
ranging in age from 18 to 35. Thirteen
were undergoing treatment in the Speech
Clinic of the University of Iowa and three
in the Speech Clinic of Western Michigan
College, Kalamazoo. Four who were residents
of Ann Arbor, Michigan, were not and
had not been undergoing treatment. Six of
the 20 subjects were females. In addition,
five Ss were lost as cases for this study when
the wire recorder broke down during their
readings.

In accordance with Shulman’s (12) criterion,
subjects were eliminated from this
study if they did not stutter on at least 2 per
cent of the words on the first passage which
they read. Since in a 200-word passage any
normal speaker might easily bobble on three
or four words, the 2 per cent criterion is
fairly conservative. Stuttering cannot be
studied except when it occurs, and this study
was designed to investigate stuttering rather
than “stutterers” as such. Four potential Ss
were eliminated in this manner.


Apparatus and Materials


Apparatus used consisted of a Silvertone
wire recorder. Two comparable 200-word
passages were selected from those available
for experimental purposes at the University
of Iowa Speech Clinic. Each passage was
edited where necessary for length and difficulty
of material. One passage described the
history of the making of iron and will be
referred to as the “Iron” passage; the other
described views on heredity and will be
referred to as the “Heredity” passage.


Experimental Design


Twenty stutterers participated. Each
served as his own control. The order of passages,
days, and experimental conditions
within each block of four subjects followed
the design shown in Table 1. There were
five such blocks.

Thus 10 performed under experimental
conditions on their first day and under control
conditions on their second. For the
other 10 the order was reversed. Each passage
appeared equally often under the two
conditions. Although the passages were
approximately equivalent in length and difficulty,
the design insures that variance arising
from any differences between them was
held to a minimum.


TABLE 1

EXPERIMENTAL DESIGN

Subject No.    First Day                  Second Day
1              Experimental “Iron”        Control “Heredity”
2              Control “Iron”             Experimental “Heredity”
3              Control “Heredity”         Experimental “Iron”
4              Experimental “Heredity”    Control “Iron”
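The counterbalancing described under Experimental Design can be checked mechanically. The sketch below is illustrative only: it assumes that the four first-day and four second-day entries of Table 1 pair up in the order listed, and the names `BLOCK` and `build_schedule` are hypothetical. It repeats the four-row block five times and counts how often each condition and passage occurs.

```python
from collections import Counter

# One block of four subjects: (first day, second day), each a (condition, passage) pair.
# The row pairing is an assumption read off Table 1 in the order listed.
BLOCK = [
    (("Experimental", "Iron"), ("Control", "Heredity")),
    (("Control", "Iron"), ("Experimental", "Heredity")),
    (("Control", "Heredity"), ("Experimental", "Iron")),
    (("Experimental", "Heredity"), ("Control", "Iron")),
]


def build_schedule(n_blocks=5):
    """Repeat the four-row block to get one (day 1, day 2) pair per subject."""
    return [row for _ in range(n_blocks) for row in BLOCK]


schedule = build_schedule()

# How many subjects meet each condition on their first day.
first_day_conditions = Counter(day1[0] for day1, _ in schedule)

# How often each condition-passage combination occurs over both days.
sessions = Counter(session for row in schedule for session in row)

if __name__ == "__main__":
    print(len(schedule), "subjects")
    print(dict(first_day_conditions))
    for combination, count in sorted(sessions.items()):
        print(combination, count)
```

With five blocks the schedule covers 20 subjects, 10 of whom begin with the experimental condition, and each passage occurs 10 times under each condition, as the text states.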


Experimental Procedure 


The subject sat facing a wire recorder with 
the mike resting on a desk about two feet 
away. The experimenter (E) sat behind the 
desk, somewhat to the side of the subject, so 
that he observed S directly when necessary 
and still operated the recorder. In practice, 
E usually followed a printed copy of the 
passage before him and depended on auditory 
cues for judging moments of stuttering. 

At the beginning of the instructions S was 
handed a copy of the passage, face down, 
and was asked not to turn it over until E 
nodded to him as a signal to begin. E then 
recorded S’s code letters, testing the wire- 
recorder in the process. Next, E read the 
instructions, turned on the recorder, and 
nodded to S, who began reading. 

As S read, E followed an identical copy of 
the passage, underlining each word on which 
observable stuttering occurred. In order to
minimize the influence that judgments on
one reading might have on the next, a sep- 
arate copy was used by E for each of the 
seven readings. This had the additional 
advantage of reducing chances of error in 
the tabulation of results. 

The seven readings may be summarized 
as follows: In both conditions the first five 
readings were consecutive with S returning
immediately to the beginning of the passage 
after each reading. The experimental and 
control conditions differed only in the nature 
of the experimental set. There was in each 
instance an interval of 10 to 15 seconds be- 
tween the fifth and sixth repetitions of the 
passage for the reading of instructions. 
Thirty minutes intervened between the sixth 
and seventh readings. 


Instructions 


Instructions for the experimental readings 
were as follows: 


[Before Readings 1-5.] You are to read this passage
aloud, five times in succession. I would like
you to read for me at your normal rate, as well as
you can, and in the way that is most natural for
you at this time. When you stutter on a word,
repeat the whole word until you can say it without
stuttering. Do this before going on to the next
word, but do not interrupt yourself in the middle
of a block. That is, be sure to finish the word first
before repeating it, and be sure to repeat it until
you can say it without stuttering before going on
to the next word. Do not “fake” any stuttering
that would not otherwise occur. Read the passage
through in the manner that is most natural for you,
repeating every stuttered word until you have said
it once normally. [Pause.]
For instance, if the sentence were, “Once upon
a time there was a young rat named Arthur,” and
you stuttered on “once” and “time” it would be like
this: “On-On-Once ... Once upon a ti-ti-ti-time ...
time etc.” [E tests S to see ... out the instruction ...]

[Before Reading 6.] Now just read the passage
as you ordinarily would, without the special
instructions.

[After Reading 6, S was told to return in 30
minutes for a short session, meanwhile to refrain
from engaging in conversations and to spend the
period studying.]

[Before Reading 7.] Read the passage as you did
the last time, without the special instructions.


Instructions for the control reading were:


[Before Readings 1-5.] You are to read this passage
aloud, five times in succession. I would like
you to read for me at your normal rate, as well as
you can, and in the way that is most natural for
you at this time. Do not ‘fake’ any stuttering that
would not otherwise occur. Just read the passage
through in the manner that is most natural for you.

[Before Reading 6.] Now read the passage once
more just as you have been doing.

[After Reading 6, subject is told to return in 30
minutes for a short session, meanwhile to refrain
from engaging in conversations and to spend the
period studying.]

[Before Reading 7.] Read the passage as you did
the last time.


Method of Tabulating Adaptation and
Consistency Effects


The frequency of stuttering in each reading 
was tabulated by counting the number of 
underlined words on each copy of the pas- 
sage. Seven sheets were used per S for each 
condition. 

In tabulating the consistency effect, carbons 
were inserted between each of the pages for 
the seven readings. Since the first reading 
was to be used as a base, all words which had 
been stuttered on in the first reading were 
marked again, using a stylus, which marked 
these words on all remaining sheets. The 
second markings could be distinguished from 
the original judgments because they appeared 
in a different color and because a diagonal 
stroke was used. 

For each reading, the number of words 
which had been marked twice was counted. 
The twice-marked words for any passage 
represented the particular words stuttered on 
in that reading which were also stuttered on 
originally in the first reading. The fre- 
quencies were then converted into percent- 
ages, using the frequency in the first reading 
as a base of 100. Thus 45 per cent on the 
fourth reading for the control condition 
would mean that, of the words stuttered in 
the first reading, 45 per cent had also been 
stuttered in the fourth reading. 
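
The consistency percentage described above can be sketched in a few lines. This is only an illustration: the function name and the word positions are hypothetical, and the original tabulation was done by hand with carbons and a stylus rather than by any program.

```python
def consistency_percent(first_reading, later_reading):
    """Percentage of words stuttered in the first reading that were
    stuttered again in a later reading (first reading = base of 100).

    Both arguments are sets of word positions within the passage.
    """
    if not first_reading:
        raise ValueError("the first reading must contain at least one stuttered word")
    twice_marked = first_reading & later_reading  # words marked on both sheets
    return 100.0 * len(twice_marked) / len(first_reading)


# Hypothetical positions of stuttered words in a 200-word passage.
reading_1 = {3, 12, 17, 25, 42, 58, 61, 77, 90, 104,
             120, 133, 150, 171, 180, 188, 190, 195, 197, 199}
# Nine of the original twenty words recur; position 64 is a new stuttering
# and does not enter the consistency figure.
reading_4 = {3, 42, 61, 64, 90, 133, 171, 188, 195, 199}

percent_fourth = consistency_percent(reading_1, reading_4)
```

With these invented data the fourth reading gives 45 per cent, matching the form of the example in the text. Words stuttered for the first time in a later reading are deliberately ignored, since the first reading is the base.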


Check on Observer Reliability and Experimenter Bias


To handle the problems of observer relia- 
bility and experimenter bias, each S was 
assigned code letters which were recorded 
along with his readings, e.g., “AFB,” “BRL.” 
Thus on the playback of the sixth and 
seventh readings, it was impossible for a 
listener to tell whether a control or experimental
reading was being played.

Observer reliability was checked in this
manner: During the playback of all the 
seventh readings, E made second judgments 
using fresh copies of the passages. A quali- 
fied speech correctionist, who acted as an 
observer, made independent judgments in the 
same manner. 

Pearson product-moment correlations were 
calculated with these results: the r between 
the second judgments of E and the observer 
listening to the playback was +.96, while 
that between E’s two sets of judgments was 
+.93. The highest obtained coefficient was 
+.98, that between the observer’s judgments 
and E’s original judgments. The latter rep- 
resent first listenings for each. Table 2 sum- 


The hypothesis of no difference between 
the experimental and control conditions, 
which is the null form of the first specific 
hypothesis given above, may be rejected at 
the .05 level. This finding is in accord with 
our prediction, viz., that substituting an
attempt at normal speech for the stuttering 
response at the point of reinforcement would 
lead to a greater decrement of stuttering in 
the experimental condition. 

From Tables 4 and 5 it can be seen that 
stuttering decreased significantly with suc- 
cessive readings for both experimental and 
control treatments. Differences between 
Readings 1-3 and 1-5 were significant beyond 
the .01 level. It may be noted that these


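TABLE 2


OBSERVER RELIABILITY COEFFICIENTS


LISTENER AND JUDGMENT                                        CORRELATION BETWEEN JUDGMENTS

Experimenter (original) and Experimenter (from playback)     +.93
Experimenter (from playback) and Observer (from playback)    +.96
Experimenter (original) and Observer (from playback)         +.98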

marizes these results. These correlations 
were sufficiently high so that E’s original 
judgments could be used throughout with- 
out appreciable danger of influencing results 
through experimenter bias. 
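
A reliability coefficient of the kind reported here is an ordinary Pearson product-moment correlation between two sets of stuttering counts for the same readings. The following is a minimal sketch; the function name and the counts are hypothetical and are not the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical stuttering counts for five subjects, as judged by the
# experimenter and by an independent observer listening to the playback.
experimenter_counts = [12, 30, 7, 18, 25]
observer_counts = [11, 32, 8, 17, 27]

reliability = pearson_r(experimenter_counts, observer_counts)
```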


RESULTS 


A comparison of the frequency of stutter- 
ing for all seven readings is presented in 
Figure 1. 

The effect of the experimental variable is 
clearly in evidence in the first reading. By 
the third, the t for the difference between
experimental and control conditions is 2.44, 
which with 19 degrees of freedom is signifi- 
cant beyond the .05 level. Further statistical 
comparisons of the differences between experimental
and control groups are summarized in Table 3.²


2 The t’s in this table were obtained by setting up a
distribution of differences and applying McNemar’s
formula 92(19) for small samples involving means
based on the same individuals.
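
The procedure named in this footnote, a t computed from a distribution of differences for the same individuals, corresponds to what is now usually called a paired (correlated-means) t test. The sketch below assumes the standard form of that test; whether it matches the cited formula term by term is an assumption, and the counts are hypothetical.

```python
import math


def paired_t(control, experimental):
    """t for the mean of paired differences, with its degrees of freedom.

    Each position in the two lists refers to the same subject.
    """
    if len(control) != len(experimental) or len(control) < 2:
        raise ValueError("need paired lists with at least two subjects")
    diffs = [c - e for c, e in zip(control, experimental)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    # Unbiased variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    std_error = math.sqrt(var_diff / n)  # standard error of the mean difference
    if std_error == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean_diff / std_error, n - 1


# Hypothetical stuttering counts for four subjects under each condition.
t_value, df = paired_t([10, 12, 9, 14], [8, 9, 9, 10])
```

With 20 subjects serving as their own controls, the same computation yields the 19 degrees of freedom cited in the text.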


FIG. 1. FREQUENCY OF STUTTERING THROUGH SUCCESSIVE
READINGS IN EXPERIMENTAL CONDITION AND CONTROL CONDITION

[Line graph: stutterings plotted against readings, with
separate curves for the control and experimental conditions.]

Differences between the two conditions for Readings
3 and 4 were significant at the 5 per cent level.

t-values are higher even though the actual 
differences are smaller than those between 
experimental and control treatments for the 
same readings. This result is a function of 
the comparatively high σ_D’s in Table 3 and
the comparatively low σ_D’s in Tables 4
and 5. This, in turn, is a function of higher
correlation between readings of the same 
material on the same day than between read- 
ings of different material on different days. 

The difference between experimental and 
control Ss on the first reading is not statistically
significant (P<.20). For those who
raise the possibility that with a greater N it 
might have been significant, the following 
interpretation is offered. An attempt was 
made to demonstrate the possible operation 
of the experimental set within the first 
reading. 

The first reading for each condition was 
divided into four 50-word quadrants. The 


frequencies were tallied in each quadrant, 
and the results seem to show a faster drop 
within the passage for the experimental than 


TABLE 3


SIGNIFICANCE LEVELS OF DIFFERENCES IN FREQUENCY OF STUTTERING BETWEEN EXPERIMENTAL AND
CONTROL CONDITIONS


READING 


4 





—-L.uUsS 
NNNU 


.60 
-95 
.65 
-990 
-95 


-10 








Mc and Me refer to means for 


freedom 


SIGNIFICANCE LEVELS OF DIFFERENCES BETWEEN READINGS WITHIN THE CONTROL CONDITION


SIGNIFICANCE LEVELS OF DIFFERENCES BETWEEN READINGS WITHIN THE EXPERIMENTAL CONDITION


* Degrees of freedom = 19

for the control data. This difference, which
may be seen in Figure 2, was not, however, 
statistically significant. 

It should be noted that there was no con- 
trol of the relative difficulty of different 
quadrants, an effect only partly alleviated by 
summing data from two different passages. 
It seems reasonable, however, that if there is 
a differential drop in Figure 2, it is largely 
a function of the operation of the experi- 
mental set, and that the small differences in 
the first quadrant might also be attributed 
to this condition. 


FIG. 2. INTRA-PASSAGE DECREMENTS FOR THE FIRST
READING

[Line graph: stutterings plotted against quadrants, with
separate curves for the control and experimental conditions.]

Numbers 1, 2, 3, and 4 refer to successive 50-word
quadrants. The faster drop in the experimental
group is interpreted as being due to the operation
of the experimental set within the first reading.


The second specific hypothesis concerned
persistence of effect. An estimation of the
relative persistence of the effect of the experimental
and control conditions may be
had by reference to the initial and final
levels of stuttering for each group. The
reduction in stutterings between the first
and seventh readings in the experimental
group was significant beyond the .01 level
(t=4.11), while the corresponding difference
for the control group was not statistically
significant (t=1.02). The difference between
the two conditions on the seventh
reading was significant beyond the 10 per
cent level. That these differences could
occur despite the fact that the experimental
set was operating within the first reading,
and so reduced the mean difference between
the first and last readings, strongly suggests
that the decrement in stuttering under the
non-reinforcement (experimental) set has
relatively persistent effects. Extinction of the
stuttering response under the experimental
set, in other words, was not only faster but
more lasting in its effects.

The third specific hypothesis referred to
the differences in consistency between experimental
and control treatments when
words stuttered upon during the first reading
were used as a base. These results are presented
graphically in Figure 3.


FIG. 3. DIFFERENCES IN CONSISTENCY BETWEEN THE
TWO CONDITIONS

[Line graph: per cent stuttered words plotted against readings,
with separate curves for the control and experimental conditions.]

Each point of Readings 2 through 7 represents the
percentage of stuttered words in the first reading
also stuttered on in that reading. Significant at
2 per cent level.

An extremely large number of cases would 
be required to demonstrate the statistical 
significance of a difference between one pair 
of obtained percentages. However, since the 
design of the experiment virtually rules out 
any consistent difference between experi- 
mental and control data other than the 
operation of the experimental set, persistence 
of a difference can become a criterion of 
significance. If the difference between the 
points on the two curves in Figure 3 were 
attributable to chance alone, we would expect
the percentage for the experimental
condition to be lower in all six tests after the 
first, only once in 64 times. Thus the null 
form of the third hypothesis can be rejected 
beyond the 2 per cent level of confidence. 
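
The "once in 64 times" figure follows from treating each of the six later readings as an independent even chance that the experimental percentage falls below the control percentage. A minimal sketch of that arithmetic is given here; the function name is hypothetical, and independence of the six comparisons is the assumption on which the argument rests.

```python
from fractions import Fraction


def prob_all_lower(n_comparisons):
    """Chance that one condition is lower in every one of n comparisons
    when each comparison is an independent 50-50 event."""
    return Fraction(1, 2) ** n_comparisons


# Six readings (2 through 7) are compared after the first.
p_six = prob_all_lower(6)
below_two_per_cent = float(p_six) < 0.02
```

Here `p_six` is exactly 1/64, about .016, which is why the null form of the hypothesis is rejected beyond the 2 per cent level.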

It should be noted that since spontaneous 
recovery was tested under control conditions 
for the experimental group, the experimental 
effect had to compete against the S’s estab- 
lished cues in a straight reading situation. 
A greater difference would be demonstrated 
if recovery were tested under experimental 
conditions. 


Discussion 


Stuttering has sometimes been described
as a hodgepodge, a maze of contradictory
behavior, unpredictable by its very nature.
In terms of general theoretical significance,
the finding that stuttering behavior can be
modified in accordance with principles of
learning supports the thesis that it is not
qualitatively different from adaptive behavior,
that no different laws of behavior are
involved.
It is obvious from the results that the experimental
technique employed leads to a
reduction in stuttering behavior. Not only
did the experimental condition show a
greater decrease in total stuttering through
successive readings, but a greater reduction
of stuttering on the particular words stuttered
on in the initial reading. In addition,
the reduction in stuttering was more lasting
in the experimental condition.

Of other possible interpretations, certain 
ones are excluded by the nature of the design. 
The idea that distraction was really the agent 
responsible for the improvement is excluded 
by the nature of the results. 

Two positive interpretations have already 
been developed in the statement of the prob- 
lem. First, less stuttering and more normal 
speech appeared in the experimental condi- 
tion because the non-reinforcement technique 
made the normal speech attempt rather than 
the stuttering response the instrumental act 
leading to reinforcement. Second, the experi- 
mental technique was effective because it sub- 
stituted the approach response of speaking for 
the avoidance response of holding back at the 
point of reinforcement. It thus reduced the
source of conflict for the stutterer and made it
easier for him to “go ahead and speak.” 

The interpretation of stuttering as conflict 
behavior may be developed more fully at this 
point. Johnson (15) has described stuttering 
as an avoidance reaction, a view which is 
fairly logical since it is a response to painful 
stimulation. Normal speech, of course, in- 
volves approach behavior in that the speaker 
“approaches” or attacks directly the word 
which he is to say. Stuttering is manifested 
by a fear of or an avoidance of the act of 
speaking. The stutterer is in a conflict situa- 
tion because he has a tendency to approach 
the word, i.e., he needs to say it, but he has 
a competing tendency to avoid the word be- 
cause of fear that he may stutter on it. Stut- 
tering can in this fashion be looked upon as 
approach-avoidance conflict, the resultant of 
the opposing urges to speak the word and to 
hold back from speaking it. We know from 
Miller’s (10) investigations that since the
avoidance gradient is steeper than the approach
gradient, an organism put into an
approach-avoidance conflict situation
characteristically approaches part way and then stops.
This is exactly what the stutterer does on a 
word; he goes part way and then stops and 
goes back—he says “K-K-K-Katy.” 

If the foregoing formulation may be re- 
garded as essentially correct, then we can 
more fully interpret the function of our ex- 
perimental variable. The approach response, 
that of a direct normal speech attempt on 
a word, is reinforced, but the avoidance re- 
sponse, that of stuttering or holding back, is 
not. The experimental set forces the stut- 
terer to attack a feared word until he has 
“conquered” it; in other words the approach 
tendency is so strengthened and the avoid- 
ance tendency so weakened that he no longer 
stops part way through the word. He can, 
therefore, say it fluently, is rewarded for 
doing this, and is able to read the passage 
next time with significantly fewer stutter- 
ings. The formulation is entirely consonant 
with the “instrumental act” interpretation and 
would involve identical predictions, since 
modification of reinforcement is the essential 
feature of each. 

It should be emphasized that the important
contribution of this study is not the specific
technique which is involved (although this
has been shown to have definite possibilities)
but the finding that stuttering can be
systematically reduced through modification of
the conditions which ordinarily lead to its
continued reinforcement. Further modification
of techniques in speech correction in
accordance with predictions of learning
theory now appears feasible.

From the standpoint of planning stuttering 
therapy, it appears from this study that the 
point of reinforcement of the stuttering re- 
sponse should become a foremost area for 
attack on the problem. Instead of content- 
ing himself with speech drills or distraction 
rituals, which by their nature give only tem- 
porary fluency, the progressive therapist of 
the future can put at the stutterer’s disposal 
techniques which will prevent further re- 
inforcement of the stuttering response and 
lead to improvement on a more permanent 
basis. 

Other techniques which would seem to 
hold potentialities in terms of non-reinforce- 
ment may be mentioned: (1) Use of a smooth 
prolongation or “slide” at the point of re- 
inforcement. (2) Use of a “bounce” provided 
it is carried beyond the moment of release. 
The author does not feel that a “bounce,” or 
voluntary repetition, is helpful when it simply 
becomes a device for helping the stutterer say 
the word. (3) Any system which encourages 
the stutterer to attack feared words and sit- 
uations and which decreases avoidance ten- 
dencies should be effective, since it would 
reduce conflict behavior. 

While in terms of preventing reinforce- 
ment of the stuttering response, the experi- 
mental set used here is only one possibility, 
the relative simplicity of this technique and 
its demonstrated value recommend it as a 
valuable clinical tool. 

It should be emphasized that non-reinforce- 
ment involves direct work on the stuttering 
block itself. It involves a direct attack by 
the stutterer on his problem and on his fears. 
The non-reinforcement technique should 
never be confused with distraction devices 
which encourage the stutterer’s avoidance 
tendencies and depend upon the necessarily 
temporary effects of disinhibition for their 


usefulness. That the results of this study 


could not possibly be due to distraction is
shown by the fact that successive trials are
necessary in order that the non-reinforcement
take effect. Distractors or disinhibitors,
on the other hand, would produce their great- 
est effect initially. 

As a therapeutic technique in stuttering, 
non-reinforcement must be fitted into a cer- 
tain context without which its proper role 
in treatment cannot be understood. Prob- 
ably the most suitable context would involve 
a general program of training in non-avoid- 
ance and a direct attack on the stuttering 
block, within a general program of personal- 
ity readjustment. 

No implication should be drawn from the 
foregoing discussion of the applicability of 
non-reinforcement to the treatment of stut- 
tering, that psychotherapy should be slighted, 
nor that individual personality factors should 
be ignored, nor that all cases should be 
treated in the same way. The application is 
to only one phase, but a very crucial phase, 
of stuttering treatment. Nearly all therapists 
working extensively with stutterers find it 
necessary at some stage in treatment to work 
on the symptoms themselves, namely, on that 
part of the disorder which presents itself 
as a self-perpetuating habit. In this aspect 
of stuttering treatment, non-reinforcement 
would appear to be far superior to techniques 
currently used. In short, to those who treat 
stutterers and find it necessary to work on the 
habit—here is a more effective means for 
doing it. 

The particular non-reinforcement technique 
employed in this experiment was chosen for 
several reasons: 


(1) It involved a clear-cut case of substituting
a normal speech attempt for
the stuttering response as the instrumental
act leading to reinforcement.
(2) It could be carried out without difficulty
by naive subjects; thus, if it
demonstrated therapeutic possibilities,
it would be available to outpatients,
children, and others unable
to undergo intensive treatment.
(3) It involved a specific test of the effect
of the much-maligned parental technique
of telling the child to “stop
and say it over.”

For years we have been telling parents
that it was wrong to stop Johnny and have
him repeat the word. We have always
thought this undesirable, because it meant
calling attention to the stuttering and aggravating
it by putting a penalty on it (16);
and, indeed, studies of such practice have
tended to conclude that it makes the victim
worse (3). Yet in apparent contradiction to
this, we occasionally see people who relate
that at one time they stuttered, but their
parents broke them of the habit by making
them “say it over until they got it right.”
In such cases it has always been easier to
account for stuttering than for its disappearance.
The present results suggest a possible
clear explanation as well as a testable hypothesis:
Those children who continued to stutter
were those who did not follow out the instructions
but simply experienced them as a
penalty; while those who “outgrew” or “overcame”
stuttering were those who altered
their stuttering behavior, however uninsightfully,
in accordance with the extinction principle
demonstrated in this study.


SUMMARY 


Stuttering persists even though it is more
punishing than rewarding in the long run,
because under ordinary conditions the stuttering
response is continually reinforced. The
assumption was made that the point at which 
the stutterer is able to go on to the next 
word is the point of reinforcement, and that 
stuttering is the instrumental act receiving 
reinforcement. 

This experiment was designed under the 
general hypothesis that stuttering is reduced 
most rapidly under conditions which permit 
least reinforcement of the stuttering response 
and most reinforcement of the normal speech 
attempt. 

Twenty adult stutterers read two 200-word 
passages on two different days and acted as 
their own controls. Under the control condi- 
tion each S read the passage six times in his 
characteristic way. Under the experimental 
(non-reinforcement) condition S read the 
passage five consecutive times, repeating each 
stuttered word until he had said it once with- 
out stuttering before he went on to the 
next word, and then read it a sixth time as 
he ordinarily would. Under both conditions,
a seventh reading followed a 30-minute rest
interval.

The experimental set did not permit termination
of the sequence until a normal
speech attempt had been made. Since in
this case stuttering as an instrumental act
for producing the word is farther removed
from the point of reinforcement than the
normal speech attempt, we have produced
non-reinforcement of the stuttering behavior.

The specific hypotheses being tested were
supported by these findings:

1. Stuttering was found to decrease more
rapidly through successive readings in the
non-reinforcement condition.

2. The experimental set resulted in a more
rapid decrease in stuttering on the particular
words subjected to the non-reinforcement
treatment.

3. The greater reduction in stuttering behavior
under non-reinforcement conditions
was more lasting over a 30-minute rest
interval.

The non-reinforcement technique was held 
to be more effective in reducing total stutter- 
ing behavior because it substituted the normal 
speaking of the word for the stuttering re- 
sponse at the point of reinforcement. The 
approach response of speaking was thereby 
strengthened while the avoidant response of 
stuttering was correspondingly weakened. 
As was predicted, the experimental set led to 
more normal speech and less stuttering. 

The finding that stuttering can be modi- 
fied in accordance with principles of learn- 
ing supports the thesis that it is not quali- 
tatively different from adjustive processes and 
that it involves the same laws of behavior. 

The modification of stuttering through 
non-reinforcement should become an area 
of first importance in therapeutic attacks on 
the problem. The experimental set employed 
here commends itself to the clinician, not 
only because of its simplicity but because of 
its demonstrated effectiveness in the system- 
atic elimination of stuttering behavior. 


SOCIOMETRIC STATUS AND INDIVIDUAL ADJUSTMENT AMONG
NAVAL RECRUITS ¹


BY ROBERT L. 


Northwestern University


Various studies suggest the existence of
relationships between group acceptance
of an individual, measured sociometrically,
and other indications of individual
adjustment (e.g., 2, 4). Careful investigation
of such relationships seems likely to throw
considerable light both on the factors underlying
personality disturbance and on the
processes of group behavior.

The present study of naval recruits was 
designed to investigate the relationships be- 
tween sociometric status of individuals within 
recruit companies during basic training, and 
indices based on three types of data presumed 
to be relevant to individual adjustment: 
neuropsychiatric examinations, illness (Sick 
Bay attendance), and disciplinary offenses. 
It was hypothesized that each of these indices 
would correlate negatively with status deter- 
mined with reference to one or more socio- 
metric criteria. Because it was felt that the 
findings might have implications for screen- 
ing or classification procedures, the study 
was designed to determine, in addition, how 
early in training the expected relationships 
might appear.²


PROCEDURE 


The Sample 


The sample comprised 16 companies of 
naval recruits starting basic training at the 
Great Lakes Naval Training Station during 


the early part of 1949. Most of these men
were between 17 and 20 years of age and had
completed from one to four years of high
school. At the time of its formation each
company contained 60 men. This number
fluctuated slightly during the 10-week course
of training as men were transferred out or
were replaced by transfers from previous
companies. Such transfers were occasioned
almost entirely by illness requiring more than
five days’ hospitalization. The number completing
training without transfer varied in
this sample from 42 to 56 men per company,
with a mean of about 50.


1 This study was carried out as part of a program of
basic research sponsored by the Office of Naval Research
... sociometric methods might pro- ...
existing techniques of neuro- ... University


Sociometric Data 

Sociometric data were obtained by use of a 
brief questionnaire asking each man to give 
the names of men in his company whom he 
would most want to consider in choosing 
another man in each of three situations. 
These choice criteria were: (1) a man to go 
on liberty with; (2) a man to volunteer 
with for a particularly tough and dangerous 
mission; (3) a man to nominate for the 
job of Acting Chief Petty Officer (the top 
recruit job in the company). In each instance 
the subject was encouraged to name as many 
or as few men as he cared to. 

This information was obtained from four 
companies at the end of the first week and at 
the end of training (the tenth week); from 
four others at the end of the second and tenth 
weeks; from four at the end of the fifth and 
tenth weeks; and from four at the end of the 
tenth week only. 

Sociometric status scores were computed 
for each individual simply by counting the 
choices he received from other men in the 
company.³ This was done for each of the
three choice criteria separately, and for the 
three combined. 
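
The scoring rule just described, a simple count of choices received on each criterion and on the three combined, can be sketched as follows. The names and choices are hypothetical; ignoring self-choices and duplicate names is an assumption of this sketch, not a rule stated in the text.

```python
from collections import Counter


def status_scores(choices):
    """Count choices received per criterion and over all criteria combined.

    `choices` maps each chooser to a dict of criterion -> list of names chosen.
    """
    per_criterion = {}
    total = Counter()
    for chooser, by_criterion in choices.items():
        for criterion, named in by_criterion.items():
            counter = per_criterion.setdefault(criterion, Counter())
            for name in set(named):       # assumption: count a repeated name once
                if name != chooser:       # assumption: self-choices are ignored
                    counter[name] += 1
                    total[name] += 1
    return per_criterion, total


# Hypothetical questionnaire returns from a three-man group.
choices = {
    "Adams": {"liberty": ["Baker", "Clark"], "mission": ["Clark"], "leader": ["Clark"]},
    "Baker": {"liberty": ["Adams"], "mission": ["Clark", "Adams"], "leader": []},
    "Clark": {"liberty": ["Baker"], "mission": [], "leader": ["Adams"]},
}

per_criterion, total = status_scores(choices)
```

Because each man may name as many or as few others as he cares to, the lists are of unequal length, and every choice received is counted regardless of how many the chooser gave.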


Adjustment Data

Records of neuropsychiatric examination 
were available for only 23 cases in the sample. 
These were men who had received more than
the routine screening examination at entrance
but had not been discharged, or who were
referred to neuropsychiatry during training.


3 All choices received were included, regardless of the
fact that the number of choices given varied considerably
from one man to another. Available evidence (2) indicates
that this gives a better status measure in terms of
correlation with a paired comparisons technique than
does the more common practice of specifying a particular
number of choices.

Data were obtained for each recruit con- 
cerning the occasions on which he reported 
to Sick Bay, time spent there, and diagnosis. 
On the average one-third of the men starting 
with a company attended Sick Bay at some 
time or other, and one-fourth of those who 
completed training in their original company 
had attended at least once. 

Disciplinary records were likewise obtained
for each recruit, showing offenses committed,
if any, and number of demerits incurred
for each. On the average 42 per cent
of men completing training with their original
company were so disciplined.


RESULTS

Consistency of Status Scores

Since one object of the study was to determine
how early these measures might have
predictive value, it is of interest to note
indications of their consistency throughout the
training period. Table 1 shows correlations
between the results of initial and final sociometric
tests for the 12 companies tested twice.
It will be seen that even by the end of the
first week status differentiation has achieved
a degree of stability sufficient to yield average
correlations of from .46 to .66 with the
tenth-week measures. By the end of the fifth week
these correlations rise to between .80 and .90.
This provides indirect evidence, incidentally,
concerning the high reliability of the status
measures. It will be noted also that status
on the “leader” criterion tends to be most
consistent and that on the “liberty” criterion
least so.


TABLE 1

CORRELATIONS BETWEEN SOCIOMETRIC STATUS ON INITIAL TEST AND ON FINAL TEST (TENTH WEEK),
BY COMPANIES

WEEK OF         NUMBER OF            CHOICE CRITERION
INITIAL TEST    COMPANY *      LIBERTY   MISSION   LEADER

First           102              .38       .53       .63
                103              .49       .72       .69
                104              .63       .79       .83
                105              .28       .40       .35
                Mean †           .46       .64       .66

Second          98               .43       .48       .60
                99               .40       .36       .54
                100              .62       .57       .83
                101              .56       .65       .90
                Mean †           .51       .53       .76

Fifth           94               .76       .89       .88
                95               .82       .81       .87
                96               .82       .82       .82
                97               .78       .92       .96
                Mean †           .80       .87       .90

* N's vary from 42 to 54 men per company.
† r corresponding to mean Fisher z.


Status in Relation to Neuropsychiatric Record

Due to the small number of subjects for
whom neuropsychiatric data were available,
it is impossible to test satisfactorily the
presence of a relationship. The trend seems,
however, to be in the expected direction.
Thus for the 23 cases the median number of 
total choices received was five in contrast 
with the median value of 9.4 for the entire 
sample on the final test. Six of the referrals 
among these cases were finally either dis- 
charged or hospitalized. These showed a 
median of one choice received. 
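
The mean correlations in Table 1 are footnoted as the r corresponding to the mean Fisher z. A minimal sketch of that averaging is given below for the first-week "liberty" column. It assumes an unweighted mean across the four companies; the text does not say whether the companies were weighted by N, and the function name is illustrative only.

```python
import math

def mean_r_via_fisher_z(rs):
    """Average correlations by way of Fisher's z (unweighted; an assumption)."""
    zs = [math.atanh(r) for r in rs]      # r -> z
    return math.tanh(sum(zs) / len(zs))   # mean z -> r

# First-week "liberty" correlations for companies 102-105, as printed in Table 1.
liberty_first_week = [0.38, 0.49, 0.63, 0.28]
mean_r = mean_r_via_fisher_z(liberty_first_week)
print(f"{mean_r:.2f}")
```

Under these assumptions the result rounds to the tabled mean of .46 for this column.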


Status in Relation to Sick Bay Attendance 
Data on Sick Bay attendance were ana- 
lyzed first simply in terms of whether or not 
men reported to Sick Bay during training. 
Few reported more than once. Each status 
variable was likewise dichotomized because 
the distributions of status measures were
frequently skewed. In the case of each choice
criterion, the cut was made as near to the
median for the entire group as possible:
between 4 and 5 for “liberty,” 2 and 3 for
“mission,” 0 and 1 for “leader,” and 9 and 10
for “total.” The four-fold table exhibiting
each relationship was then tested by chi-square.
In order to gain some idea of the
size of the relationships, product-moment
coefficients were estimated from phi-coefficients.
Table 2 shows these estimated r’s,
based in most cases on the pooled results for
four companies.


TABLE 2

RELATIONSHIPS BETWEEN SOCIOMETRIC STATUS AND SICK BAY ATTENDANCE †

CRITERION               RECRUIT COMPANIES
                98-101    94-97    90-93    ALL (90-105)

Initial Test ‡
  Liberty       -.31**
  Mission
  Leader         .09
  Total         -.12
  [Other legible entries, position not recoverable: 28*; -.16; 23*]

Final Test (Tenth week) §
  Liberty       -.12     -.25*
  Mission       -.11     -.15
  Leader         .01     -.09
  Total          .04     -.13
  [A further column, heading not recoverable: -.29*; -.26*; -.04; -.38**]

† Relationships measured in terms of r’s estimated from phi-coefficients.
‡ N for 4-company groups on Initial Test varies from 208 to
§ Total N on Final Test is 797.
* Corresponding chi-square significant at 5 per cent level.
** Corresponding chi-square significant at 1 per cent level.


It should be noted that data are included
here only for men who joined one of these
companies at the time of its formation and
remained with it until the time of the sociometric
test in question; men transferring into
the company after formation are not included.
Thus all men considered had an
equal opportunity to achieve sociometric
status. The fact of being ill prior to the test
could not impair this opportunity appreciably,
for hospitalization of five days or more
resulted in transfer out of the company, and
relatively few of the remaining men who
went to Sick Bay were hospitalized at all.
It should be noted further that the initial test
data include men who were subsequently
transferred out, in addition to those who
remained throughout training. This accounts
for the difference in N’s between initial and
final tests.

Several points of interest may be noted in
Table 2. First, the relationships are with
one exception negative, and many significantly
so, indicating that men receiving fewer
choices tend more frequently to go to Sick
Bay. Second, with respect to the particular
choice criterion, significant relationships are
found rather consistently for “liberty,” in
some cases for “mission,” but in no instance
for the “leader” criterion. Third, significant
relationships are found on the initial test,
even as early as the first week, but only for
“liberty,” among the single criteria. In some
cases, notably in companies 102-105, the
initial test relationships are higher than those
obtained on the final test. This seems due in
large part to the inclusion of men in the
initial test data who subsequently became
sufficiently ill to be transferred out of their
companies. If these cases are excluded from
the data for companies 102-105, for example,
the estimated r for the “liberty” criterion
drops to -.16.
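
The four-fold analysis described above (a cut near the median, a chi-square test, and a phi-coefficient) can be sketched as follows. The choice counts and attendance records in the example are hypothetical, invented for illustration. The sketch stops at phi, since the text does not state how the product-moment r was estimated from it, and it uses the uncorrected chi-square (N times phi squared), which is also an assumption.

```python
import math

def dichotomize(scores, cut):
    """True for scores above the cut (e.g. cut=4.5 for 'between 4 and 5')."""
    return [s > cut for s in scores]

def fourfold_counts(high_status, attended):
    """Cells of the 2x2 table: (high & attended, high & not, low & attended, low & not)."""
    a = b = c = d = 0
    for h, s in zip(high_status, attended):
        if h and s:
            a += 1
        elif h:
            b += 1
        elif s:
            c += 1
        else:
            d += 1
    return a, b, c, d

def phi_and_chi_square(a, b, c, d):
    """Phi coefficient and the 1-df chi-square without continuity correction."""
    n = a + b + c + d
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    phi = (a * d - b * c) / denom
    return phi, n * phi ** 2

# Hypothetical data: "liberty" choices received and Sick Bay attendance for ten men.
choices = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
attended = [True, True, True, False, True, False, False, True, False, False]

cells = fourfold_counts(dichotomize(choices, 4.5), attended)
phi, chi2 = phi_and_chi_square(*cells)
print(cells, round(phi, 2), round(chi2, 2))
```

A negative phi here corresponds to the direction reported in Table 2, that is, fewer choices received going with more frequent attendance. The chi-square would be referred to the 1-df critical value (3.84 at the 5 per cent level).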

Since it is reasonable to assume that factors 
related to social acceptance do not have equal 
importance for all forms of illness, the ques- 
tion may be raised as to the possible existence 
of status differences among men given different
diagnoses at Sick Bay. Table 3 shows
for men attending Sick Bay the relationship
of recorded diagnosis to status on the several
choice criteria. In order to maintain fairly
adequate cell frequencies, the status variables
were dichotomized at lower cutting points
than were employed in the previous analyses,
and data from all companies were pooled for
the initial and for the tenth-week tests.
Table 3 presents the values of chi-square
obtained in testing the agreement of the two
distributions of frequencies for each criterion.
None of these approaches significance. In
other words, although men not going to Sick
Bay differ in status from those who do, there
seem to be no significant status differences
related to diagnosis among men in the latter
group. It may be noted, however, that cases
of measles tend toward lower status on the
initial test, and cases of catarrhal fever in the
same direction on the final test.


Status in Relation to Disciplinary Offenses

Data on disciplinary offenses were treated
simply in terms of whether or not men
received demerits during training. Table 4
shows the correlations with sociometric status,
derived in essentially the same fashion as
those in Table 2. In this instance, however,
the analysis of both initial and final test
results was confined to men remaining with
their original companies throughout training.
It was assumed that inclusion in the initial
test data of men subsequently transferring
out of the company would, so to speak, not
give all men considered an equal opportunity
to earn demerits.


TABLE 4

RELATIONSHIPS BETWEEN SOCIOMETRIC STATUS AND DISCIPLINARY OFFENSES †

CHOICE CRITERION        RECRUIT COMPANIES
                        94-97     ALL (90-105) ‡

Initial Test
  Liberty
  Mission
  Leader
  Total
  [Legible entries following “(2nd wk.)”, position not recoverable: 23*; .10; -.24*; -.19; -.20]
  [Legible entries following “(5th wk.)”, in order: -.15; (illegible); -.37 (significance mark illegible); -.35**]

Final Test (Tenth week)
  Liberty
  Mission
  Leader
  Total
  [Legible entries, in order, column not recoverable: -.21; (illegible); -.45 (significance mark illegible); -.28*]

† Relationships measured in terms of r’s estimated from phi-coefficients.
  N for 4-company groups varies from 192 to 205.
‡ Total N is 797.
* Corresponding chi-square significant at 5 per cent level.
** Corresponding chi-square significant at 1 per cent level.


As Table 4 shows, the relationships are
again almost entirely negative, and in many
cases significantly so, indicating that men
receiving fewer choices tend more frequently
to break the rules or at least to be caught and
punished. Secondly, the “leader” and “mission”
criteria are most consistently related to
disciplinary offenses, although for all companies
combined these correlations do not significantly
exceed that for “liberty.” Thirdly,
no significant relationships are found in the
first week, and one does not appear for the
“leader” criterion until after the second week.


DISCUSSION AND FURTHER ANALYSIS


The data thus far considered confirm the 
hypothesized relationship of sociometric 
status to Sick Bay attendance and discipli- 
nary offenses, and show in addition certain 
variations in these relationships which make 
it worth while to attempt to elaborate 
hypotheses in somewhat greater detail. The 
question of status in relation to neuropsychi- 
atric deviation remains an open one due to 
insufficient data and will not be dealt with
specifically in the discussion which follows. 

In attempting to account for the relation 
of such behavior as Sick Bay attendance or 
receiving demerits to group acceptance, it is 
obviously necessary to consider both the individual
and the group situation. Figure 1
presents a simplified schema of some of the
possible connections between these elements.
It is assumed that illness or breaking rules
represent reactions to the frustrating aspects
of the military situation, reactions of a kind
to which the individual’s personality predisposes
him. It is not contended, of course,
that frustration alone accounts for all Sick
Bay attendance or punished offenses, but only
that it is a factor underlying the observed
relationships, which are, after all, not very
large. The relationship to group opinion
may come about then in one or both of
two ways. The frustrated individual may
become ill or break the rules (Fig. 1, l. 2),
and this behavior may adversely affect group
opinion (l. 3); or, while disposed to react in
this fashion, he may also act in such a way
in his contacts with his fellows as to affect
group opinion directly (l. 4). In either case,
of course, changes in group opinion must
depend upon the standards of judgment prevailing
in the group and upon the kinds of
situations, i.e., choice criteria, to which judgment
is oriented. Finally, the individual’s
behavior may in turn affect his situation,
either as a direct consequence of one of the
frustration reactions under consideration
(l. 6, e.g., being punished), or through the
effects of his behavior on group opinion
(ll. 2, 3, 5 or 4, 5). These connections complete
a potentially vicious circle, but it
seems likely that this would in practice be
interrupted by the individual’s leaving the
situation (e.g., through serious illness or
court-martial) or by his inhibiting the undesirable
reaction in the face of adverse group
opinion or authority. In the latter event,
adoption of some alternative mode of behavior
might be expected.


[Figure 1: schematic diagram; legible box labels include “Frustrating aspects of military situation” and “Deviant behavior: illness or ...”]

FIG. 1. POSSIBLE FACTORS UNDERLYING RELATIONSHIPS OF STATUS TO SUCH BEHAVIOR AS ILLNESS OR DISCIPLINARY OFFENSES
See text for explanation.

Certain aspects of this general interpreta- 
tion can be examined more closely in the 
light of the present data. With reference to 
the possibility of a direct effect of the indi- 
vidual’s behavior upon group opinion (Fig. 1, 
l. 4), the findings already presented suggest 
that this may hold true for Sick Bay attend- 
ance, since significant relationships with 
status were found in the first week (Table 2). 
However, 10 men in these companies (102- 
105) had attended Sick Bay prior to this 
sociometric test, and they must be excluded 
if the hypothesis is to be tested conclusively. 
Table 5 shows the results of eliminating these 
men from the first-week analysis; the esti- 
mated r’s from Table 2 are included for 
comparison. As may be noted, the correla- 
tions are reduced slightly but not below 
acceptable levels of significance. In other 
words, men going to Sick Bay tend to achieve 
lower status before they go. Lower status 
may, of course, contribute a share of the 
frustration presumably resulting in eventual 
breakdown, but in any case this finding 
implies that the Sick Bay-prone individual 
tends to act in ways which result in non- 
acceptance by the group with reference to 
certain kinds of situations. 

In the case of disciplinary offenses this 
kind of relationship again seems to be 
present, although it develops more slowly. 
As Table 4 indicates, significant correlations 
with status did not appear in the first week; 
but several did in the second week, and this 
despite the fact that no demerits had been 
given in these companies (98-101) before the 
sociometric test. Thus the behavior of 
demerits-prone individuals apparently tends 
to result in lower group acceptance in certain 
situations. 

To test the additional possibility that Sick 
Bay attendance or receipt of demerits may in 
themselves affect group opinion is more difficult.
The fact that some reduction may
occur in the correlations with initial test 
status when men exhibiting deviant behavior 
prior to the test are eliminated from consideration
(as in Table 5) does not answer
the question, for these men very probably
represent the more extreme forms of the
personality constellations involved. An
attempt has been made to get at this factor
in the following manner. In the case of Sick
Bay attendance for companies tested twice
(94-105), all men were singled out who
started and finished training in the same
company, received no demerits throughout
training, and did not go to Sick Bay before
their initial sociometric test. Within this
group, each man going to Sick Bay after the
initial test was matched by a random procedure
with another man from his company
who did not attend Sick Bay, the
matching being done on the basis of total
choices received on the initial test. A comparable
analysis, confined to men not going
to Sick Bay during training, was carried out
to ascertain the effects of disciplinary offenses
subsequent to initial test. Table 6 shows the
results of these analyses, which indicate that
Sick Bay attendance has in itself no effect on
status but that disciplinary offenses are associated
with a significant status decline or
failure to improve. Negative results in
this instance are perhaps more conclusive
than positive. There is no way of determining
whether the men receiving demerits,
although standing equally with the controls
on the initial test, may not have subsequently
encountered difficulties which altered their
behavior in relation both to their fellows and
to the regulations. Nevertheless, it seems
likely that breaking regulations would in
itself affect the individual’s status. This
point will be touched on again presently.


TABLE 5

RELATIONSHIPS BETWEEN SOCIOMETRIC STATUS AND SICK BAY ATTENDANCE IN THE FIRST WEEK
(COMPANIES 102-105)

                                              CHOICE CRITERION
                                         LIBERTY   MISSION   LEADER

All men (N = 224)
Men not attending Sick Bay before
  sociometric test (N = 214)
[Legible entries, position not recoverable: 3*; -.19]

* Corresponding chi-square significant at 5 per cent level.
** Corresponding chi-square significant at 1 per cent level.


TABLE 6

FINAL STATUS OF MEN SHOWING DEVIANT BEHAVIOR AFTER INITIAL TEST, COMPARED WITH THAT OF MEN
NOT SHOWING IT †

                                  MEAN OF TOTAL CHOICES RECEIVED
                                       FINAL     DIFF.

Men not going to Sick Bay
Men going to Sick Bay

Men not getting demerits
Men getting demerits       117

† See text for explanation of bases for selection of these groups.
* Significant at [illegible] per cent level.


Consideration of the findings with respect
to different choice criteria gives some further
clues as to the bases of the group’s reaction
to the individual. It will be recalled that
Sick Bay attendance was related most consistently
to status on the “liberty” criterion
and not related significantly to choices received
as “leader”; whereas status on the
“leader” and “mission” criteria showed the
most consistent relationships to disciplinary
offenses. This implies that the Sick Bay-prone
individual tends to be accepted in the
less personal, group-defined role of leader,
but not in relatively closer, more personal
relationships such as liberty companion.⁴
(The role of mission companion presumably
stands somewhere between these two extremes.)
Since the recruits form a relatively
homogeneous group within the larger population,
their tastes in friends probably have
much in common at the outset of training,
and so it is not surprising to find the Sick
Bay-“liberty” relationship as early as the first
week.

Demerits-prone individuals, it appears on 
the other hand, tend to be less well accepted 
in both types of relationships, but more con- 
sistently so in the more impersonal, group- 
defined leadership role. This is also not 
surprising, perhaps, in groups which empha- 
size heavily authority and discipline. Now 
if the bases of judgment of the disciplinary 
offender consist in group norms of this sort, 
then it is reasonable to suppose that overt 
violations of the norms would in themselves 
tend to produce a negative group reaction 
independently of other aspects of the indi- 
vidual’s behavior. (See preceding discussion 
with reference to Table 6.) One wonders 
also whether these “other aspects,” which 
seem to account for the demerits-status rela- 
tionships observed prior to any official punish- 
ment, may not consist primarily in words or 
acts expressing opposition to authority. These 
considerations suggest that the late appear- 
ance of the demerits-status relationship may 
be due to either or both of two factors, time 
required for acquisition of group norms 
relative to leadership and discipline and time 
required to learn which men tend to fall 
short of these requirements. 

In discussing disciplinary offenses, it has 
been assumed that the group’s reaction to 
them would necessarily be negative, and the 
results obtained indicate that in general this 
is the case. Quite possibly, however, where 
resistance to authority becomes widespread 
within a group or sub-group, defiance of 
regulations will have the opposite effect on 
group opinion. This may have accounted 
for some variation in the size of the observed 
relationships in different companies, but it 
has not been possible to find clear evidence 
of such an effect. 

4 The distinction suggested here is akin to that drawn 
by Jennings between the “psychegroup” and the 
“sociogroup” (3).


The data provide no very definite infor- 
mation concerning personality differences 
between men who go to Sick Bay and the 
disciplinary offenders. The findings with 
respect to different choice criteria suggest 
certain differences between these groups. 
And it is worthy of note that the two groups 
do not overlap greatly. Of 434 men in either 
or both groups who completed training in 
their original companies, only 98, or 23 per 
cent, showed both forms of behavior. Or, 
stated in another way, the correlation be- 
tween the two is .14. But further under- 
standing of the personality factor and of 
these relationships generally, obviously re- 
quires independent appraisals of the deviants’ 
behavior, together with more information on 
the various other aspects of the situation. 


SUMMARY 


Sociometric tests were given to 16 com- 
panies of naval recruits, numbering about 
60 men per company, at the end of their 
10-week course of basic training; and to 12 
of the companies, in blocks of four each, at 
the end of the first, second, and fifth weeks 
of training. The questionnaires requested 
nominations of liberty companion, co-volun- 
teer on a dangerous mission, and company 
recruit leader. Status scores, representing 
number of choices received on each of these 
criteria, were examined in relation to records 
of neuropsychiatric disturbance, Sick Bay 
attendance, and disciplinary offenses. The 
findings indicate that: 

1. Status within the company is in general 
related negatively and significantly to Sick 
Bay attendance and disciplinary offenses. A 
similar relationship seems to hold for neuro- 
psychiatric cases, but they are too few to 
afford an adequate test. 

2. Within the Sick Bay group no signifi- 
cant status differentiation appears between 
different diagnostic categories. 

3. Sick Bay cases tend most consistently to
be less acceptable as liberty companions (in- 
terpreted as representing a close, personal 
relationship) but are equally acceptable as 
leaders (interpreted as representing a less per- 
sonal, group-defined role). Disciplinary 
offenders tend to be less acceptable in all 
situations, but more consistently as mission 
companions and leaders.

4. Status as a liberty companion relates significantly
to Sick Bay attendance as early
as the first week of training. Relationships 
of disciplinary offenses to status do not ap- 
pear until the second week or later. 

5. Committing a disciplinary offense ap- 
pears in itself to affect group opinion, but this 
is not true of going to Sick Bay. In both 
cases other aspects of the deviants’ behavior 
affect acceptance by the group. 

A general interpretation is presented which 
accords with the observed results and calls 
attention to further kinds of observations 


necessary to adequate understanding of the 
individual-group relationships involved. 
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CLINICAL IMPLICATIONS FOR A MEASURE OF MENTAL HEALTH * 


BY LOUIS L. McQUITTY 


University of Illinois 


INTRODUCTION 

This paper has several interrelated purposes.
(a) It reviews clinical evidence
and theory in support of a clinically
significant personality factor. (b) It outlines 
certain discrepancies between clinical theories 
and practices on the one hand and statistical 
approaches toward the measurement of clinical
variables on the other. (c) It offers an
approach for bridging the gap between sta- 
tistical methods and clinical evidence. (d) It 
applies the approach in isolating what appears 
to be a clinically significant personality factor. 
(e) It reports evidence on the effectiveness 
of measures of the factor in discriminating 
between mental hospital patients and com- 
munity persons. 


MEASUREMENT AND CLINICAL THEORIES 

Let us first consider some of the discrep- 
ancies between efforts toward the measure- 
ment of mental health on the one hand and 
clinical practices and theories on the other. 
Statistical methods for measurement of mental
health usually rely heavily on the concept
of symptoms. This is not so much the case
with clinical practices and theories. Many
clinical practices and theories of personality 
tend to de-emphasize the importance of 
symptoms. They stress, instead, considerations
that are maintained to be more fundamental
(1, pp. 43-44, 47-50; 7, pp. 263-264;
11, pp. 24-25; 23, pp. 367-368; 25, p. 417; 26,
p. 57). It is often maintained by both the 
practitioner and the theorist that the symp- 
toms can be properly evaluated only when 
the more fundamental conditions have been 
investigated. Taking a lead from this point 
of view, the statistical method herein reported 
for evaluating mental health attempts to tap 
one of these more fundamental conditions of 
which some of the theorists and clinicians 
speak. 

1 Research on which this article is based was supported
financially in part by the WPA in 1939-1941 and in
part by the Research Board of the University of Illinois
in 1948-49. Appreciation is expressed to both agencies
and to Mr. Russell J. Jessen, Research Assistant 1948-49.
A briefer statement of this article was read at the 1949
APA Convention.


As background material, before outlining 
our statistical approach, it is helpful to 
discuss further some of the discrepancies 
between clinical practices and statistical 
methods. Some clinicians and personality 
theorists are critical of the value of statistical 
methods (1, p. 382; 3, pp. 413-414; 14, pp. 
16-17, 20-21, 67-68; 28, pp. 6-7). It is here 
suggested that the clinician’s reservations 
concerning quantitative psychology are di- 
rected rather exclusively at some of the ways 
in which we have employed statistics. In 
efforts toward personality measurement, we 
have usually applied our methods to the 
evaluation of static concepts, the sum of 
symptoms (e.g., 7, pp. 263-264; 12, p. 27; 26, 
p. 57). This application to static concepts 
does not derive exclusively from some limit- 
ing characteristic of statistical methods. Test 
construction methods can be applied to 
dynamic concepts, and this paper reports an 
effort in that direction. 

The central position of the concept of 
symptoms in efforts to measure mental 
health has influenced the point of departure 
in those efforts. The usual approach is to 
obtain a list of symptoms from the literature 
or from case histories and state these as ques- 
tions for the inventory (15, p. 185). A refine- 
ment is to use only those questions which 
give statistically significant differences when 
applied to samples of different populations, 
such as mental hospital patients and com- 
munity persons (15, p. 187). The score is a 
composite evaluation of answers. It repre- 
sents an unsystematic combination of several 
factors (10, 19), devoid for the most part of 
meaningfulness in relationship to some 
theoretical position concerning the psycho- 
genesis of mental illness—with the possible 
exception of a theory by analogy to medicine. 
The approach to be discussed here for the 
measurement of mental health is based on a 
theory of the psychogeneses of mental health 
and mental illness. The approach hypothe- 
sizes that there is a function in which mental 
hospital patients differ from community persons
and that numerical indices of this difference
can be obtained (17). The approach
assumes that symptoms are a product of both 
specific experiences and the hypothesized 
functional difference in persons. The value 
of symptoms as a measure of individual dif- 
ferences is thought to be restricted because 
of their dependence, in part, on specific situ- 
ations and experiences. This restriction is 
probably less when one measures individual 
differences in a function rather than in 
symptoms. 

Having hypothesized a functional differ- 
ence between hospital patients and com- 
munity persons, a measurement problem is 
to find questions which can be treated sta- 
tistically in a way that will give a measure 
of the level of functioning rather than a com- 
posite score of content. If one thinks of 
intelligence tests as measures of an ability (20, 
pp. 37-38) or a level of functioning (27, pp. 
3-42) rather than a sampling of knowledge 
or content, the same problem is involved in 
intelligence-test construction. In this sense,
the method of personality-test construction
here under consideration is more analogous 
to intelligence-test construction than is the 
case for most statistically constructed tests of 
personality. 


Another advantage of the approach herein 
proposed derives from the fact that statistical 
item-selection and scaling can depend ex- 
clusively on the responses of community per- 
sons rather than on differences in responses 
between community persons and mental hos- 
pital patients. Consequently, a difference 
between community persons and mental 
hospital patients found by this method is not 
likely to disappear when the test is applied 
to other samples of patients. This is a par- 
ticularly significant advantage because so 
frequently differences found by other 
methods have largely disappeared when ap- 
plied to other samples (26, pp. 43-44; 32, 
p. 141). 

The problem of test construction, in the 
method herein outlined, is to find content 
items that measure a functional level for a 
wide population of subjects. We may take 
our lead toward a solution of this problem 
from certain clinical practices. In clinical 
evaluations of the mental health status of 
clients, some of us attempt to evaluate their 
responses in terms of degree of integration
(4; 5, p. 54; 8, p. 236). More specifically,
we consider the personal statements that the
clients make and attempt to estimate the 
degrees of integration reflected by them. We 
are handicapped by the lack of an instrument 
which measures the degree of integration 
represented by the personal statements of the 
clients. We have at least two lines of ap- 
proach in an effort to develop such an instru- 
ment. (a) We could attempt to determine 
the degrees of integration reflected in all the 
possible combinations of personal statements. 
This would be much too exhaustive for pos- 
sible consideration at this time. (b) We could 
attempt to develop an index of the degrees 
of integration reflected in various combina- 
tions of answers to questions, each question 
being a multiple-choice item. This is the 
approach pursued in the study herein 
reported. 

An immediate problem in this approach is 
the one of how to determine the degrees of 
integration reflected by answers to questions. 
The particular approach selected determines 
more specifically what particular meaning 
we give to the concept of integration. 

One approach is to make a statistical study 
of the opinions of experts as to the degrees 
of integration reflected in various combinations
of answers. This is the approach applied
by Grzeda (9) in his doctoral thesis.
It gave certain results in agreement with the 
approach pursued in the study herein 
reported. 

The method pursued in the present study 
involves empirical determinations of the de- 
grees of integration reflected in various com- 
binations of answers, using community 
persons as subjects. The accepted index of 
integration, for the purpose of study, becomes 
a composite of the degrees of correlation 
between successive pairs of answers. If a 
subject gives highly intercorrelated answers, 
he is assumed to be giving well integrated 
answers. On the other hand if he gives 
answers with low intercorrelations, he is 
assumed to be giving relatively disintegrated 
answers. More technical aspects of the deter- 
mination of the degrees of intercorrelations 
are fully discussed in two previous articles 
by the author (16, 17). Rather than review 
these discussions here, it would seem prefer- 
able to attempt to make the approach more 
meaningful by outlining an analogy in the 
field of mental abilities. 
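
The index described above, a composite of the correlations between successive pairs of answers as determined on community persons, may be illustrated by a rough sketch. This is not the procedure of the earlier articles (16, 17), whose technical details are not reproduced here. It assumes, for illustration only, that the correlation for a pair of answers is the phi-coefficient between giving the first answer and giving the second in a community sample, and that the composite is a simple mean. The six-person community and its answers are hypothetical.

```python
import math

def phi(xs, ys):
    """Phi coefficient between two equal-length lists of booleans."""
    a = sum(1 for x, y in zip(xs, ys) if x and y)
    b = sum(1 for x, y in zip(xs, ys) if x and not y)
    c = sum(1 for x, y in zip(xs, ys) if not x and y)
    d = sum(1 for x, y in zip(xs, ys) if not x and not y)
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return 0.0 if denom == 0 else (a * d - b * c) / denom

def integration_index(subject, community):
    """Mean phi, in the community sample, over the subject's successive answer pairs."""
    values = []
    for q in range(len(subject) - 1):
        gave_first = [person[q] == subject[q] for person in community]
        gave_second = [person[q + 1] == subject[q + 1] for person in community]
        values.append(phi(gave_first, gave_second))
    return sum(values) / len(values)

# Hypothetical community sample: six persons, three multiple-choice questions.
community = [
    ["a", "a", "a"], ["a", "a", "a"], ["a", "a", "b"],
    ["b", "b", "b"], ["b", "b", "b"], ["b", "b", "a"],
]

consistent = integration_index(["a", "a", "a"], community)
mixed = integration_index(["a", "b", "a"], community)
print(round(consistent, 2), round(mixed, 2))
```

In this toy sample the subject whose answers usually go together in the community receives the higher index, and the subject who combines answers that rarely go together receives the lower one.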
MEASURES OF SCATTER


In the field of mental abilities, we have 
the concept of scatter in relationship to intel- 
lectual deterioration and mental illness. A 
subject with a high scatter score is one who 
has given a highly diverse pattern of passes 
and failures. Having failed a number of 
relatively easy items, he passes a number con- 
siderably more difficult. The pattern is 
diverse because, in general, persons who fail 
the relatively easy items do not pass those of 
considerable difficulty. Measures of scatter, 
though often positively related to mental 
health, have never been shown to be highly 
related to it (6, pp. 118-125; 13, pp. 979-988). 
The scatter is frequently interpreted as due 
to a bifurcation of the intellectual ability 
level. It is maintained that the level for 
reasoning falls below that for memory of old 
material. If we accept this interpretation, it 
would seem that mental instability or intel- 
lectual deterioration appears more in the area 
of reasoning than in the area of remote 
memories (21, pp. 68-69; 22, p. 351; 31, pp. 
67-68). It may be that it appears still more 
in some other areas than it does in reason- 
ing. A lead to these other areas might 
result from an effort to find an area which
differs from reasoning in the same way that
reasoning differs from remote remembering.
Following this lead, we here regard
remembering as the reproduction of material
and reasoning as the deduction or
induction of relationships. In a remember-
ing test, one passes or fails an item in accord- 
ance with whether or not he reproduces 
material. There is a definite objective cri- 
terion with which the answer can be com- 
pared. In the case of a reasoning test-item, 
a criterion with which the answer can be 
compared may depend exclusively on logic. 
It is a more subjective criterion than is 
operative in the case of remembering. 
According to this argument, reasoning tests 
differ from remembering tests in that they 
may involve less objective criteria. Changes 
in reasoning ability are more reflective of 
mental instability than are changes in mem- 
ory for old material. If we could test with 
still more dependency on subjective criteria, 
we might find expressions in which scatter is 
still more reflective of mental instability. 
Answers to questions on personality inven-
tories are dependent on subjective criteria in
the sense that there are no right or wrong
answers. It may be that a measure of scatter 
on personality inventories would be highly 
sensitive to individual differences in mental 
stability. 
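
The notion of scatter used above can be made concrete with a minimal sketch. It assumes that items are ordered from easiest to hardest and that scatter is counted as the number of pairs in which an easier item is failed while a harder one is passed. This counting rule and the name `scatter_score` are illustrative assumptions, not the scoring procedure of any of the studies cited.

```python
def scatter_score(passes):
    """Count the (failed easier item, passed harder item) pairs.

    `passes` lists pass/fail results ordered from the easiest item
    to the hardest. The pair-counting rule is an assumption made for
    illustration only.
    """
    inversions = 0
    failures_so_far = 0
    for passed in passes:
        if passed:
            # Every earlier (easier) failure forms one scattered pair
            # with this later (harder) pass.
            inversions += failures_so_far
        else:
            failures_so_far += 1
    return inversions


# A consistent record: passes everything up to a point, then fails.
consistent = [True, True, True, False, False]
# A diverse record: fails easy items but passes harder ones.
scattered = [False, True, False, True, True]

print(scatter_score(consistent), scatter_score(scattered))  # prints: 0 5
```

A record that passes every item up to some level and fails all beyond it scores zero; the more easy failures are followed by harder passes, the higher the score.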

An omission in the logic just outlined 
pertains to the fact that recent memories do 
not behave in the same manner as old mem- 
ories. Recent memory ability, like reasoning 
ability, seems to decline somewhat with men- 
tal illness. This suggests that scatter scores 
for some types of items might be more effec- 
tive than for others in obtaining significant 
indices on personality inventories. More 
specifically, it may mean that scatter scores 
between old and recent opinions of self are 
the most valid for differentiating between 
mental hospital patients and community 
persons. Recent opinions would probably be 
those based on subjective criteria rather than 
objective criteria. Subjective and objective 
as here used may be contrasted by examples. 
An example of an opinion based on subjec- 
tive criteria is, “I often feel just miserable.” 
One based on objective criteria is, “I have 
many friends.” We here regard as subjective 
that which is observable by the subject only 
and as objective that which is observable by 
others as well as the subject. 

If opinions based on objective criteria do 
not change as freely as those based on sub- 
jective criteria, one would suspect that the 
following scatter scores might be effective in 
differentiating between mental hospital pa- 
tients and community persons: 

1. Scatter scores between objectively and
subjectively determined opinions.

2. Scatter scores derived from subjectively 
determined opinions exclusively—pro- 
vided that some opinions have changed, 
resulting in new ones, and others have 
not, thus remaining old opinions. 

We have some research evidence for the 
interpretation just offered. An introductory 
study of the approach herein suggested indi- 
cated that scatter scores derived from sub- 
jective opinions of self are more valid for the 
purpose at hand than those derived from ob- 
jective opinions (17). 

If we accept the hypothesis that there are 
some opinions of self which may be objec- 
tively or subjectively determined, depending 
on the viewpoint of the subject, we can find 
additional theoretical support for the position 
being outlined. Support is supplied by
authors who maintain that mental stability
is enhanced by developing objective rather 
than subjective opinions of self (2, pp. 221- 
246; 24, p. 536). 

One might now raise the question as to 
why recent memories do not behave in the
same manner as remote ones. A possible
answer is that the mentally ill are more sub- 
jectively oriented with respect to recent 
memories than they are with respect to old 
ones. Old memories were learned when the 
patient was more objectively oriented. 

A further amplification of the above point 
of view is possible. In the case of recent 
memories, the patient may be treating them 
in a manner similar to reasoning. He may 
derive his recent memories as logical deduc- 
tions from distorted subjective opinions. In 
other words, his criteria for recent mem- 
ories, just like those for reasoning, may be 
subjective. On the other hand, his criteria 
for old memories may remain objective. 

There are a number of approaches for 
obtaining measures of scatter on personality 
inventories. Some of these are outlined 
below with a brief discussion concerning 
their appropriateness: 


a. We could select those personality inventories
which have items scaled in a fashion
somewhat analogous to intelligence-test
items and take a measure of
scatter on them. Thurstone’s attitude
scales (30) are representative of this class
of personality inventory. A limiting factor
in the appropriateness of these for
the problem at hand is the fact that they
sample only a very limited area of mental
expressions. Unless great numbers of them
were used, they might not represent
unbiased samples of mental expressions.

b. We might take the discrepancy between
test and retest on personality inventories,
or between the scores on two
equivalent personality inventories. A
limiting condition here, for the purpose at
hand, results from the fact that scatter
could be equally involved in the two tests
and still not show up as a discrepancy in
total scores.

c. There is another way in which we
could treat the results of two or more
administrations of a test. We could score
the tests in a manner designed to determine
the number of items for which the
subject changes his answers on successive 
administrations. This method would not 
satisfy our present interest. Diversity, or 
scatter, of self endorsements represents our 
interest. Diversity of self endorsements 
and fluctuation of endorsements may or 
may not be independent of each other. 
We cannot know for certain until we can 
measure each and compare the results. 


A MEASURE OF PERSONALITY INTEGRATION


Our purpose is to obtain some measure of 
the scatter of self endorsements as an index 
of personality integration. This represents 
a difficult problem. Our method can be bet- 
ter understood if we consider some of the 
difficulties. 

One of the difficulties grows out of certain 
theoretical positions and clinical findings. 
These theoretical positions and clinical find- 
ings indicate that there are diverse ways of 
integrating. In other words, the extent to 
which a particular sample of mental expres- 
sions is an index of integration depends, in 
part, on the ways the expressions have been 
interrelated by the subject. A sample of 
mental expressions may reflect different de- 
grees of integration, depending upon how 
the different subjects have interrelated the 
expressions. A particular sample of mental 
expressions may be a more valid index of 
integration for some subjects than for others. 
In terms of factor theory, each sample of
mental expressions reflects both a group or
common factor variance and a specific factor
variance (29). The ratio of group-factor influence
to the specific-factor influence presumably 
varies somewhat from individual to individ- 
ual. It also varies some from one sample of 
expressions to another. Different individuals 
give different samples of expressions. 

Several problems in test construction 
emerge as a result of the points just indicated. 
These problems may be outlined as follows: 

a. To determine the average degree of 
integration represented by samples of men- 
tal expressions. 

b. To limit the population for determin- 
ing the average degree of integration be- 
tween expressions to those subjects who 
use the particular expressions being evalu- 
ated. In other words, the indices of inte- 
gration here being outlined are not a 
function of the population as a whole. The
successive indices are, instead, functions of 
different subgroups within the population. 
This provision represents at least a partial 
concession to the viewpoint that different 
individuals can interrelate, or integrate, 
the same expressions in different ways. 

c. To find sample expressions in which 
the ratio of group factor to specific and 
error factors is high for a wide range of 
people. 

d. To isolate such a range of sample 
expressions that a relatively large and un- 
biased number of them will be given by 
all subjects for whom the test is designed. 


Let us now outline our approach to these 
problems. We want to obtain samples of 
mental expressions from many subjects, and 
we want to be able to compare the samples 
on the basis of degrees of integration re- 
flected by them. If we allow the subjects to 
talk freely without limits or restrictions, we 
are limited in comparing the several samples 
of mental expressions as to integration be- 
cause they have so little in common from 
one subject to the next. Accordingly, we ask
questions of the subjects. We assume that 
the expressions, as samples from the subjects, 
will be more representative to the extent that 
the answer possibilities are unrestricted, but 
they must not be unrestricted to the extent 
that the statistical labor becomes unreason- 
able. We have decided on three answer pos- 
sibilities to the personality questions, Yes, 
No, and Between, the latter answer possi-
bility representing an expression between
Yes and No. 

Our purpose is to obtain a measure of the 
scatter or diversity of answers which is re- 
lated to individual differences in personality 
integration. Our measure of diversity be- 
tween any two answers is based exclusively 
on responses of those subjects who gave one 
or both of the two answers. As a result, an 
individual’s item scores do not derive from 
a comparison of the individual with a statis- 
tical average of the population, but only with 
those who gave some answers in common 
with his. 

As we evaluate the degree of diversity 
represented by a subject’s successive pairs of 
answers, we compare the subject with dif- 
ferent individuals. It is thought that this 
approach represents an advantage because it
provides empirical standards of integration
without assuming that the average of the
population, or any one component group
thereof, must serve as the standard for each
pair of answers. The diversity of each pair
of answers is in general determined on the
basis of different categories of people, namely,
those individuals who gave one or both 
answers of the pair. It follows then that the 
particular answers an individual gives deter- 
mine the category with which he is com- 
pared. Large diversity scores are obtained 
by those individuals whose successive en- 
dorsements are characteristic of divergent 
categories of people. Small diversity scores 
are obtained by those individuals whose suc- 
cessive responses are characteristic of rather 
similar categories. 
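
The logic of the diversity score can be illustrated with a rough sketch. It assumes that the category for an answer is the set of subjects who gave that answer, that the diversity of a pair of answers is one minus the proportion of common members among those who gave one or both, and that a subject's score is the mean diversity over his successive pairs of answers. These formulas and all names in the code are assumptions for illustration; the scoring actually used is reported in the studies cited and is not reproduced here.

```python
from collections import defaultdict


def build_categories(responses):
    """Map each (item index, answer) to the set of subjects who gave it."""
    categories = defaultdict(set)
    for subject, answers in responses.items():
        for item, choice in enumerate(answers):
            categories[(item, choice)].add(subject)
    return categories


def pair_diversity(categories, first, second):
    """Diversity of two answers, based only on subjects giving one or both.

    Assumed formula: 1 - (subjects giving both) / (subjects giving either).
    """
    either = categories[first] | categories[second]
    both = categories[first] & categories[second]
    return 1.0 - len(both) / len(either)


def diversity_score(responses, subject):
    """Mean diversity over a subject's successive pairs of answers.

    Requires at least two items per subject.
    """
    categories = build_categories(responses)
    answers = list(enumerate(responses[subject]))
    values = [pair_diversity(categories, a, b)
              for a, b in zip(answers, answers[1:])]
    return sum(values) / len(values)


# Hypothetical answers of four subjects to three questions.
responses = {
    "s1": ["Yes", "Yes", "Yes"],
    "s2": ["Yes", "Yes", "Yes"],
    "s3": ["Yes", "Yes", "No"],
    "s4": ["No", "No", "Yes"],
}

for name in responses:
    print(name, round(diversity_score(responses, name), 3))
```

Under these assumptions a subject whose successive answers are each shared with much the same people obtains a small score, while one whose answers are characteristic of groups with few common members obtains a large one.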


EMPIRICAL EVIDENCE

If we now find that these diversity scores 
are related to individual differences in per- 
sonality integration, they will assist us in 
suggesting an operational definition of per- 
sonality integration. One way to investigate 
whether or not these diversity scores are 
sensitive to individual differences in person- 
ality integration is to administer the test to 
two groups of subjects who differ in per-
sonality integration and highly related vari-
ables only. This method represents the
approach attempted in this study, the two 
groups of subjects being male mental hos- 
pital patients and male community persons. 
An effort was made to equate the groups on 
variables other than mental health status, and 
those related to it, by including in each group 
only persons who had had some college 
training. The details of the investigation 
are reported elsewhere (18). The important 
fact for our present consideration is that the 
diversity scores differentiated between the 
groups with a critical ratio of 9.97. There 
were 84 subjects in the group of community 
persons and 130 subjects in the group of men- 
tal hospital patients. Fifty-seven per cent of 
the hospital patients made a score higher 
than plus two standard deviations for the 
community persons. This degree of differ- 
entiation was obtained without application 
of statistical criteria for improvement of the 
test through item selection. Two statistical 
criteria for improvement of the test have been 
outlined and shown to be effective (16, 17). 
The community persons used as subjects in
this study are the ones on whom the diversity 
test was standardized, and it is planned to 
check the differentiation by application of 
the test to other groups. 
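
For readers unfamiliar with the two statistics reported above, the following sketch shows one conventional way of computing a critical ratio (difference between means divided by the standard error of that difference) and the proportion of one group scoring above the other group's mean plus two standard deviations. The scores are invented for illustration, and the use of sample variances is an assumption; the exact computation used in the study is not stated in the text.

```python
import math
import statistics


def critical_ratio(group_a, group_b):
    """Difference between means divided by its standard error."""
    mean_difference = statistics.mean(group_a) - statistics.mean(group_b)
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return mean_difference / standard_error


def fraction_above_cutoff(patients, community, k=2.0):
    """Proportion of patients above the community mean plus k S.D."""
    cutoff = statistics.mean(community) + k * statistics.stdev(community)
    return sum(score > cutoff for score in patients) / len(patients)


# Invented diversity scores, not data from the study.
community = [10, 12, 11, 13, 14]
patients = [15, 18, 20, 16, 21]

print(round(critical_ratio(patients, community), 3))
print(fraction_above_cutoff(patients, community))
```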


AN OPERATIONAL DEFINITION OF PERSONALITY 
INTEGRATION 


If the results just reported are accepted as
evidence that the diversity scores are sensitive
to individual differences in personality
integration, we may use this evidence in
conjunction with our earlier description of those
who make high and low diversity scores and offer
a tentative definition of personality integration.
A disintegrated personality is one who
in a personality test situation gives successive
responses that are characteristic of diverse
categories of people; an integrated personality
is one who gives successive responses
that are characteristic of rather similar
categories of people. Categories are similar or
diverse depending on the extent to which
they have common members.

The tentative definition of personality inte-
gration that we have here offered is restricted to
the test situation. It is suggestive of wider 
definitions. One such wider definition is as 
follows: The disintegrated person is one 
whose successive reactions are characteristic 
of highly diverse social groups; the integrated 
personality is one whose successive reactions 
are characteristic of rather similar social 
groups. Definitions such as this one can 
serve as hypotheses for further research into 
the nature of personality integration. 
THE EFFECTS OF TWO DEGREES OF FAILURE ON LEVEL OF
ASPIRATION AND PERFORMANCE ¹ ²


BY IRA M. STEISEL AND BERTRAM D. COHEN ³


University of Iowa


A NUMBER of studies by Lewin and his
students (2, 3) have been concerned
with the effects of failure or success
upon the goal-orientation of the performing 
organism. They devised a method for deter- 
mining human goals in certain situations— 
the widely known level of aspiration tech- 
nique. 

Festinger (2) and Jucknat (quoted in 3) 
have shown that levels of aspiration tend to 
shift downward after non-attainment of a 
self-announced criterion of performance and, 
conversely, upward after the attainment of 
the S’s goal. Such findings have been related 
to underlying “feelings of success or failure” 
as inferred from introspective reports (or 
other reactions) of the S. Jucknat’s definition 
of failure, for example, was stated in terms 
of E’s judgment of S’s feeling of failure; and 
the degree of success or failure was also de- 
fined on the basis of judgments by E of the 
intensity of such subjective experiences. It 
was found that the strength of the tendency 
to shift the level of aspiration upward or 
downward after success or failure, respec- 
tively, was a direct function of the intensity of 
the S’s feelings as judged by E. 

While these relationships are certainly rea- 
sonable, it seems fair to observe that a defini- 
tion of intensity of success or failure which 
depends upon S’s reactions to an event does 
not readily allow for the experimental manip- 
ulation of these variables. Indeed, since sub- 
jective failure and success are inferences based 
upon S’s behavior, it is possible that their 
identification is not entirely independent of 
the specific responses they are intended to 
explain. 

1 This experiment is one of a series of studies in per-
sonality being done under the direction of Dr. I. E.
Farber at the State University of Iowa. The authors
would like to express their appreciation of his assistance.

2 This paper was presented in part at the April, 1949,
meetings of the Midwestern Psychological Association.

3 Now at the Seattle Guidance Clinic and Indiana
University, respectively.


PROBLEM


The purpose of the present study was to 
determine whether differences in degree of
failure, defined not in terms of S’s reactions
but in terms of independently controlled ante- 
cedent events, could be shown to have any 
reliable effect upon his responses. Accord- 
ingly, failure was simply defined as non-at- 
tainment of an S’s indicated level of aspira- 
tion without regard for how he felt about it. 
Degree of failure was defined in terms of the 
extent to which a given level of aspiration 
exceeded the subsequent performance. The 
size of this difference, along with certain 
standardized statements by E, constituted our 
definition of degree of failure. A subsidiary 
problem was to investigate the effects of fail- 
ure on a non-verbal measure of motivation 
defined in terms of changes in speed of per- 
formance immediately following a failure 
experience. 


PROCEDURE 


Forty undergraduate students in the intro- 
ductory psychology course at the State Uni- 
versity of Iowa served as Ss. They were as- 
signed to either one of the two experimental 
conditions: 20 in a mild-failure and 20 in a 
severe-failure group. The task employed in- 
volved the solution of several series of simple 
arithmetic problems. Each series was pre- 
sented on a separate sheet and the Ss were 
told that a standard length of time (40 sec.) 
would be allowed for each trial. The actual 
time allotted was varied by E, without S’s 
knowledge, in order to control success and 
failure. Scoring was in terms of the number 
of problems completed on each trial. After 
an initial practice trial, each S was required 
to state a level of aspiration for each subse- 
quent trial. There were 12 trials in all, of 
which the third, sixth, and ninth involved 
failure; i.e., non-attainment of the expressed 
level of aspiration. The third trial involved
the same (moderate) failure for both groups,
each S being stopped five to six problems 
prior to his expressed goal by the announce- 
ment that the time was up. The sixth and 
ninth trials, however, involved differential 
failure for the two groups. Ss under the mild- 
failure condition were stopped just one to two 
problems short of their goals; whereas Ss 
under the severe-failure condition were 
stopped at a point 10 to 12 problems short of 
their expressed goals. Ss were allowed to 
reach their levels of aspiration on all but these 
three trials. On the success trials, perform- 
ance was halted as soon as the level of aspira- 
tion was reached, so that no differential suc- 
cesses occurred. 
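
The stopping rule by which E controlled success and failure can be summarized in a brief sketch. The shortfall ranges are those given above; the function names, the choice of the smallest shortfall as a default, and the assumption that the practice trial is counted among the numbered trials are illustrative only.

```python
def shortfall_range(trial, group):
    """Range of problems by which S is stopped short of his stated goal."""
    if group not in ("mild", "severe"):
        raise ValueError("group must be 'mild' or 'severe'")
    if trial == 3:
        return (5, 6)          # moderate failure, identical for both groups
    if trial in (6, 9):
        return (1, 2) if group == "mild" else (10, 12)
    return (0, 0)              # success trial: S is allowed to reach his goal


def stopping_point(trial, aspiration, group, shortfall=None):
    """Number of problems completed when E announces that time is up."""
    low, high = shortfall_range(trial, group)
    if shortfall is None:
        shortfall = low        # illustrative default, not from the study
    if not low <= shortfall <= high:
        raise ValueError("shortfall outside the range for this trial")
    return max(aspiration - shortfall, 0)


# A hypothetical S who states a goal of 20 problems on every trial.
for trial in (2, 3, 6, 9):
    print(trial,
          stopping_point(trial, 20, "mild"),
          stopping_point(trial, 20, "severe"))
```

Because the shortfall is fixed by E in advance of the trial, degree of failure is defined by an antecedent condition rather than by S's reaction to it.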

Following each success trial, E said, “Very 
good. You made it. How many do you 
expect to make next time?” After the mod- 
erate failure trial, E said, “I guess you didn’t 
make it. You made (five) problems less than 
you expected. How many do you expect to 
make next time?” Following each severe 
failure E said, “That’s a very poor showing. 
You missed by a pretty sad margin. You only 
got —— that time. How many do you expect 
to get next time?” Following the mild fail- 
ure experience, E said, “Too bad, you almost 
made it. How many do you expect to get 
next time?” 


RESULTS 


The direction and amount of change in the 
mean levels of aspiration of the two groups 
as a function of failure is indicated in Figure 
1. It will be seen that the first (moderate) 
failure, which was identical for all Ss, re- 
sulted in similar shifts in level of aspiration 
by the members of both groups. Following 
the differential failure on trial 6, although 
level of aspiration is reduced under both con- 
ditions, it may be seen that the shift down- 
ward is greater following severe failure. The 
mean of the difference between these shifts 
is significant well beyond the 1 per cent level 
of confidence, the t being 4.14 for 19 degrees
of freedom. The final differential failure 
again resulted in differential shifts. This 
mean difference is also significant beyond the 
1 per cent level of confidence, the t being
3.15. 

It should be noted that the amount of 
change produced by successive failures in the 
severe-group tended to decrease. The same
trend appeared for the mild-group. As a 
matter of fact, in the latter group, the data 
indicate that following the final failure the 
Ss tended to shift their levels of aspiration 
slightly upwards instead of in the conven- 
tional downward direction. 

FIG. 1. SHIFTS IN LEVEL OF ASPIRATION FOLLOWING
FAILURE FOR TWO DEGREES OF FAILURE
(Axes: number of problems, trials. Curves: mild
failure, severe failure.)
Each point plotted represents the difference be-
tween the mean level of aspiration following a
failure trial (trial 3, 6, or 9) and that following
its immediately preceding success trial (trial 2, 5,
or 8). The points are plotted separately for each
group (see legend).

In addition to the analysis of shifts in level
of aspiration, the difference between each
level of aspiration score and the immediately
preceding performance score (the number of
problems actually completed) was calculated
for each S for each trial. The means of these
“discrepancy scores” for all trials under both
conditions are plotted in Figure 2. Immediately
following differential failure trials the
severe-failure group evidences a positive rise
in the mean discrepancy score, which is
markedly different from that for any other
trial of that group and for all trials of the
mild-failure group. The scores of the
mild-failure group remain relatively stable
throughout, with inter-trial fluctuations tending to
decrease with successive trials. The differences
between the discrepancy score means for
the two groups following the differential
failure trials are significant well beyond the
1 per cent level of confidence (t’s of 11.52 and
12.67, respectively).

Further inspection of Figure 2 suggests that 
the discrepancy scores for the severe group 
are larger following the last differential fail-
ure. The difference (t of 2.12) is significant
at the 5 per cent level of confidence. This 
trend is consistent with the aforementioned 
tendency for the level of aspiration scores to 
be less affected by failure with successive 
failure trials. 
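
The two verbal measures used in these analyses can be stated compactly. The sketch below assumes that a discrepancy score is the level of aspiration stated after a trial minus the performance on that trial, so that a positive score means the stated goal exceeds what was just accomplished, and that a shift is the change in stated level of aspiration from one trial to the next. The sign convention, the function names, and the scores are assumptions chosen for illustration.

```python
def discrepancy_scores(stated_after, performance):
    """Level of aspiration stated after each trial minus that trial's score."""
    return [goal - done for goal, done in zip(stated_after, performance)]


def aspiration_shift(stated_after, index):
    """Change in stated level of aspiration relative to the preceding trial."""
    return stated_after[index] - stated_after[index - 1]


# Hypothetical severe-failure S on trials 5 (success) and 6 (failure).
performance = [20, 11]    # problems completed on trials 5 and 6
stated_after = [21, 18]   # goals announced after trials 5 and 6

print(discrepancy_scores(stated_after, performance))  # prints: [1, 7]
print(aspiration_shift(stated_after, 1))              # prints: -3
```

In this invented case the goal is lowered by three problems after the severe failure, yet it remains seven problems above the performance just recorded, which is the pattern of a downward shift accompanied by a large positive discrepancy.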

FIG. 2. DISCREPANCIES BETWEEN LEVEL OF ASPIRA-
TION AND PRECEDING PERFORMANCE
(Axes: number of problems, trials. Curves: mild
failure, severe failure.)
The points plotted show the “discrepancy score”
means for each group for each trial. A single dis-
crepancy score for an S was computed by subtract-
ing his level of aspiration score (the number of
problems he expects to complete) for a given trial
from the immediately preceding performance score
(the number of problems actually completed on the
trial just finished).

Although previous authors have given con-
siderable attention to the effects of failure
upon the level of aspiration, there has been
little concern with these effects upon actual
performance. We have seen that failure, ob-
jectively defined, influences motivation, at
least in its verbalized form. We may also
inquire whether failure influences perform- 
ance in such situations. To answer this ques- 
tion the time actually required to complete 
the first 20 problems was obtained for each S 
for every trial. Unfortunately, the groups 
were not initially equated with respect to this 
variable, so that their speeds differed, even 
before the differential experimental condi- 
tions were introduced. The results, there- 
fore, should be interpreted with caution. 
They are sufficiently suggestive, however, to 
be worth consideration. 

In order to compare the speed of perform- 
ance following failure with that following 
success, the mean time score for a post-failure 
trial was compared with the combined mean 
for the adjacent two post-success trials (the 
trial immediately preceding and the trial im- 


mediately following the post-failure trial). 
These data are presented in Table 1. The 
values reported in the first two columns of 
this table are the means and standard devia- 
tions for the first and second differential post- 
failure trials (trials 7 and 10) for each group. 
The third and fourth columns present the 
corresponding measures for the combined 
adjacent post-success trials (trials 6 and 8, 
and trials 9 and 11) for both groups.


TABLE 1

EFFECT OF FAILURE ON SPEED OF PERFORMANCE
(Time in Seconds for Twenty Problems)

                POST-FAILURE TRIAL     ADJACENT SUCCESS TRIALS
                M        S.D.          M        S.D.
1st Severe      20.45
2nd Severe      18.55
1st Mild        22.05
2nd Mild        20.90

It will be noted that in three of the four 
comparisons there was an increase in speed 
on the post-failure trials. The differences, 
however, were insignificant except in the 
case of the second severe-failure versus its ad- 
jacent trials mean (Table 1, row 2). The 
t for this difference is 4.59 which is sig- 
nificant well beyond the 1 per cent level. 

Combining the means for both of these 
post-failure trials for the severe-failure group 
and comparing this with the combined mean 
for all four adjacent success trials for this 
group, yields a t of 3.50, which is significant
beyond the 1 per cent level of confidence. A 
comparable analysis of the effect of failure 
as compared with success in the mild-failure 
group did not reveal a statistically significant 
difference. 

To determine whether failure has a cumula- 
tive effect upon the tendency to increase speed 
of performance, the mean difference between 
the first post-failure trial and its adjacent suc- 
cess trials was compared with the correspond- 
ing difference for the second differential 
failure. For the severe group the mean dif- 
ference between these differences was 1.57 
seconds; the t was 3.10, which is significant
beyond the 1 per cent level of confidence. 
For the mild group a similar analysis revealed
no significant difference. 
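
The comparisons just described can be sketched for a single S as follows. The time on a post-failure trial is set against the mean of the two adjacent post-success trials, and the cumulative effect is the difference between the second such comparison and the first. The times are invented, and the function names and the convention that a negative value means faster work after failure are assumptions made for illustration.

```python
def post_failure_effect(times, post_failure_trial):
    """Time on a post-failure trial minus the mean of its adjacent trials.

    `times` maps trial number to seconds for the first 20 problems.
    A negative value means S worked faster after failure.
    """
    before = times[post_failure_trial - 1]
    after = times[post_failure_trial + 1]
    return times[post_failure_trial] - (before + after) / 2


def cumulative_effect(times, first_trial=7, second_trial=10):
    """Difference between the second and the first post-failure effect."""
    return (post_failure_effect(times, second_trial)
            - post_failure_effect(times, first_trial))


# Invented times (seconds) for one S on trials 6 through 11.
times = {6: 21.0, 7: 20.0, 8: 21.0, 9: 20.5, 10: 18.0, 11: 20.5}

print(post_failure_effect(times, 7))    # prints: -1.0
print(post_failure_effect(times, 10))   # prints: -2.5
print(cumulative_effect(times))         # prints: -1.5
```

Averaging such differences over the Ss of a group and testing the mean against zero corresponds to the t tests reported above.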

These data indicate that an immediate re- 
action to severe failure was an increase in 
speed of performance. Furthermore, this re-
lationship became more pronounced with suc-
cessive failures.


DISCUSSION


The results of the present study confirm 
those of past investigations in that, in gen- 
eral, failure experiences were followed by the 
lowering of aspiration levels, and more severe 
failure resulted in a greater drop than did 
mild failure. Thus, objective failure was 
found to operate in a fashion similar to that 
reported for subjective failure, at least insofar 
as these particular relationships are con- 
cerned. 

While the level of aspiration decreased 
more markedly following severe-failure, the 
discrepancy scores for the Ss under this con-
dition were extremely high and positive.
Also, this discrepancy was found to be sig- 
nificantly greater after the second than after 
the first differential failure. It would seem, 
then, that although the Ss took some account 
of the severity of their failures, they did not 
reduce their levels of aspiration proportion- 
ately. Furthermore, this tendency to disre- 
gard the objective degree of failure increased 
with successive failures. 

We are inclined to attribute this growing 
ineffectiveness of failure in changing the level 
of aspiration to the interpolation of success 
trials between the failures. One might hy- 
pothesize that an “achievement expectancy” 
(or set) is built up as a function of the num- 
ber of previous successes (and also, probably, 
as a function of the pattern of successes and 
failures). Level of aspiration may depend 
on such an achievement expectancy as well as 
on the immediately preceding performance. 
The stronger the achievement expectancy, the 
greater would be its relative effect on the level
of aspiration. Thus, following a series of
successes even a severe failure might not effect 
any considerable drop in level of aspiration, 
in which case a high discrepancy score would 
result. 

Our data indicate that the achievement ex- 
pectancy rapidly acquired sufficient strength 
to compete successfully with the opposing 
tendencies generated by the immediate per- 
formance scores. 

The increase in speed of performance fol- 
lowing severe failure is congruent with the 
hypothesis that failure may have motivating 
effects upon performance (1, 4). This moti- 
vating aspect of failure is, we believe, one 
which may be obscured or at least ignored in 
the typical level of aspiration study. Our 
results showed that, with successive failures, 
there was a diminishing effect of failure on 
the level of aspiration, while the data for the 
performance measure indicated the reverse 
trend. In view of this disagreement between 
the measure of motivation as inferred from 
S’s statements (shifts in level of aspiration) 
and that derived from his performance (in- 
creases in speed), it seems reasonable to pro- 
pose that the conventional verbal measures of 
level of aspiration do not reflect some sig- 
nificant aspects of the motivational systems 
studied. 
PREJUDICE, CONCRETENESS OF THINKING, AND
REIFICATION OF THINKING 


BY MILTON ROKEACH 
Michigan State College 


THAT high-prejudiced persons are gen-
erally more rigid than low-prejudiced
persons has been demonstrated both ex- 
perimentally and clinically. Studies of the 
former type which may be cited here are those 
of Rokeach (19, 20) and Christie (2). Studies 
of the latter type are those of Frenkel-Bruns-
wik (4, 5, 6), Frenkel-Brunswik, Levinson,
and Sanford (7), Frenkel-Brunswik and San- 
ford (8), Reichard (18), and Levy (15). 

In all these studies measurements and eval- 
uations of rigidity were determined by prob- 
lems (arithmetical and spatial) or materials 
(case studies and Rorschach) deliberately di- 
vorced in content from the main social atti- 
tude under consideration. 

One problem which therefore suggests itself 
for further research is whether it is objectively 
possible to speak of the rigidity of a social 
attitude as such. From the research findings 
cited above one is easily tempted to expect
that a highly-prejudiced person’s thinking
about social groups as such is characterized by
greater rigidity than is a less-prejudiced per-
son’s thinking about social groups. That
such is the case, however, has not been ob- 
jectively demonstrated heretofore except by 
indirect inferences about underlying person- 
ality structure. It is the purpose of this study 
to attempt a more direct demonstration of 
this hypothesis. 

Two commonly observed facts, however,
appear to contradict this hypothesis. First,
it has been frequently observed that it is
about as difficult to alter tolerant attitudes
in the direction of intolerance as it is to alter
intolerant attitudes in the direction of toler-
ance. Related to this is the frequently-held
view that attitudes at both extremes involve
rigid thinking, the only difference being that
while the prejudiced person’s stereotypes
more frequently have to do with such groups
as Negroes and Jews, the less-prejudiced
person’s stereotypes more frequently have to do
with such groups as capitalists and fascists.
Second, it has also been observed that some
extremely leftist individuals are quite dog-
matic or doctrinaire in their social-political
ideologies. In both sets of instances the im-
pression is strong that both extremes of the
attitude continuum are equally resistant to
change.

1 In a personal communication, Dr. David M. Levy
reports that high-prejudiced persons manifest greater
coarctation on the Rorschach than low-prejudiced persons.

Perhaps resistance to change per se is not 
necessarily a criterion of rigidity. A glance 
at the writer’s operational definition of rigid-
ity² will reveal that the crucial point is
whether there is resistance to change when 
the objective conditions demand it. Unfortu- 
nately, however, it is not easy to apply this 
criterion to determine the rigidity of a social 
attitude. It would be extremely difficult, if 
not impossible, to ascertain scientifically, for 
example, whether a prejudiced person’s solu- 
tion of a given social problem represents a 
greater or lesser inability to select the more 
efficient of two (or more) alternative solutions 
to a problem. 

A more promising criterion of the rigidity 
of a social attitude is suggested by the ob- 
served relationship between rigidity and 
concrete thinking. Data we have presented 
elsewhere (19, 20) indicate not only that per- 
sons high in prejudice are more rigid in 
solving problems but also that they are more 
concrete in their problem-solving attempts. 
We have also shown (21) that rigid problem- 
solvers (disregarding the prejudice variable) 
are significantly more concrete than less-rigid 
problem-solvers. This is in line with findings 
in psychopathology that rigidity and con- 
creteness of thought are generally found to- 
gether (3, 9, 10, 11, 12, 16, 25-29). In yet 
another paper (22) we have shown that increases
in the time available to perceive problems
effect a reduction in both rigidity and
concreteness of thinking, again indicating the
intimate relationship between these two variables.
And, finally, we have suggested elsewhere
(19, 22) that behavioral rigidity is a
necessary consequence of concrete thinking
and that both concreteness and rigidity may
be means of self-defense against threat (see
also 1 and 2).

2 Elsewhere the writer has defined rigidity “as the
inability to change one’s set when the objective conditions
demand it, as the inability to restructure a field in which
there are alternative solutions to a problem in order to
solve that problem more efficiently” (20, p. 260).

This crucial relationship between rigidity 
and concrete thinking led us to shift the focus 
of our inquiry from rigidity to concrete think- 
ing. Assuming on the basis of empirical find- 
ings that the latter is one criterion of the 
former, and bearing in mind the earlier find- 
ings of the relationship between prejudice 
and rigidity, we arrived at the hypothesis that 
ethnocentric thinking should be more con- 
crete, more object-bound than less-ethnocentric
thinking.³ Stated more specifically, we
hypothesized that the high-prejudiced person's 
thinking about given groups should be more 
frequently rooted in the concrete individual 
objects comprising such groups, while the 
low-prejudiced person’s thinking should be 
more frequently in terms of the abstract prin- 
ciples for which the given groups stand. 

If our hypothesis is correct, then it is rea- 
sonable to expect that the intolerant person’s 
social attitudes should be more frequently 
organized in terms of such concrete social 
objects as Catholics, Protestants, Fascists, 
Communists, etc., while the tolerant person’s 
attitudes should more frequently be organized 
in terms of such abstractions as Catholicism, 
Protestantism, Fascism, Communism, etc. 
One way to test this hypothesis is simply to 
ask several groups differing in degree of 
prejudice to define such concepts as Catholicism,
Protestantism, etc., and from an ob-
jective analysis of these definitions deter- 
mine whether these groups differ in their 
modes of thought. 


A PRELIMINARY STUDY


A preliminary study was first conducted, 
the main purpose being to set up categories 
for evaluating definitions. The subjects were 
students in the writer’s class in Personality
at Michigan State College. They were asked
to define briefly, in writing, 10 concepts. Five
were religious concepts; five were political-
economic concepts. The religious concepts
were Atheism, Catholicism, Christianity,
Protestantism, and Judaism. The political-
economic concepts were Capitalism, Communism,
Democracy, Fascism, and Socialism.

3 The present analysis and research is one of two
attempts designed to discover the underlying cognitive
processes distinguishing ethnocentric from less-ethnocentric
attitudes. Elsewhere (23, 24) we will present another
and, perhaps, a more comprehensive formulation of
the underlying cognitive differences.

The definitions proved more difficult to 
categorize than had been anticipated. Many 
of the definitions did not seem to fall neatly 
into one or another category. Some defini- 
tions, while abstract, seemed to represent 
different levels of abstractness (e.g., a belief 
vs. a system of beliefs). Other definitions 
were reified abstractions. Qualitatively dif- 
ferent from the latter were other definitions 
which started out abstractly but ended con- 
cretely. Other definitions were clearly con- 
crete. Still others were hardly classifiable at 
all—the subjects merely listed one or more 
elements, related or unrelated, which were 
supposedly characteristic of the concept in 
question (e.g., Capitalism = “free enterprise, 
freedom”). Cutting across all of the above 
were evaluative definitions (e.g., Catholicism 
= “a narrow-minded kind of religion”; De- 
mocracy = “the best kind of government”). 

Finally, the definitions differed from each 
other in factual correctness. Since we were 
primarily interested in modes of thought, we 
ignored factual correctness as such. 

After much trial and error we finally 
emerged with the following four categories: 

1. Abstract Definitions. Definitions were 
classified as abstract if they took one of the 
following general forms: a government in
which ..., a form of government in
which ..., a religion in which ..., a form
of religion in which ..., a belief in ..., a
system of beliefs about ..., a theory about
..., a doctrine in which ..., a philosophy
..., a way of life ..., a doctrine ..., the
worship of ..., a dictatorship ....

2. Concrete Definitions. A definition was 
categorized as concrete if the concept was de- 
fined or explained in terms of a person or 
group holding to a belief, or in terms of a 
person (or persons) being a member of a 
church or religion, etc., or if the concept
was otherwise defined, implicitly or explicitly,
in terms of the subject himself or
another person or persons. Definitions of this
sort took the following general forms: one
who believes in ..., a believer in ..., a follower
of ..., believing in ..., a group that
(who) believes in .... Other examples, not
following these general forms but neverthe- 
less indicating that the explanation was 
anchored onto the subject himself or other 
persons, are: “Everyone on an equal basis” 
(Communism); “Christ, the son of God has 
been on earth and is our savior” (Chris- 
tianity); “You get out what you put in” 
(Communism); “The right of the individual
to make as much money as he is capable” 
(Capitalism); “What the average white per- 
son is” (Christianity). 

3. Reification Definitions. A third type of 
definition which emerged with some fre- 
quency and which was not anticipated prior 
to this preliminary study was reification. 
For purposes of this study reification may be 
defined as the personification or the con- 
cretization of the abstraction itself. The 
following examples should illustrate this: 
“The practicing of ideals set up by the 
church who obtained their creed and beliefs 
from the teachings of Jesus” (Christianity);
“A type of government trying to rule for the 
good of the people and not succeeding too 
well” (Socialism); “A group of religions 
believing in Jesus Christ” (Christianity);
“Religions that worship God ...” (Protestantism);
“A religion started in Rome”
(Catholicism). 

4. Miscellaneous Definitions. A large pro- 
portion of the definitions could not reason- 
ably be classified into any of the above 
categories. For lack of a better term we 
called these miscellaneous definitions. Most 
of these were unclassifiable primarily because 
they were in terms of isolated characteristics 
or were fragmentary in nature. Some ex- 
amples: “Free competition, free enterprise” 
(Capitalism); “Community control of prop- 
erty” (Communism); “State supreme. Indi- 
vidual is nothing, state is everything” 
(Fascism); “The individual is subject to the 
state” (Communism); “Autocratic, militaris- 
tic” (Fascism); “Life under a dictator” (Fas- 
cism); “Money rules all” (Capitalism); 
“Few rule majority” (Socialism). 

Other definitions categorized as miscel- 
laneous were those starting out abstractly 
but ending concretely (or vice versa). Examples
of such definitions are: “Oriental
religions, believe in one God” (Judaism); 
“Belief that money is the most important get 
ahead at any cost” (Capitalism); “Believers 
in God—A religion” (Christianity); “Form 
of religion each class of people believe in” 
(Christianity). 

Finally, a small number of definitions (less 
than 2 per cent) were not otherwise classi- 
fiable because the subject gave no definition 
or said he didn’t know. 
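
The categories above were applied by human judges. As a purely illustrative aid, the opening phrases listed as "general forms" can be turned into a rough first-pass screen. The sketch below is an assumption of ours and not the procedure used in the study: the function name `first_pass_category` and the pattern lists are hypothetical, and anything that does not open with one of the listed forms (including all reification and miscellaneous definitions, which turn on meaning rather than wording) is simply handed back for judgment.

```python
import re

# Opening phrases copied from the "general forms" listed in the text.
ABSTRACT_FORMS = (
    r"a (form of )?government in which",
    r"a (form of )?religion in which",
    r"a belief in",
    r"a system of beliefs about",
    r"a theory about",
    r"a doctrine",
    r"a philosophy",
    r"a way of life",
    r"the worship of",
    r"a dictatorship",
)
CONCRETE_FORMS = (
    r"one who believes in",
    r"a believer in",
    r"a follower of",
    r"believing in",
    r"a group (that|who) believes in",
)


def first_pass_category(definition: str) -> str:
    """Hypothetical surface screen; it does not replace a human judge."""
    text = definition.strip().lower()
    if not text or text in {"don't know", "i don't know"}:
        return "no definition"
    # re.match anchors at the start, so only the opening phrase is tested.
    if any(re.match(pattern, text) for pattern in ABSTRACT_FORMS):
        return "abstract"
    if any(re.match(pattern, text) for pattern in CONCRETE_FORMS):
        return "concrete"
    # Reification, fragments, and mixed definitions need a judge.
    return "needs judge"
```

A screen of this kind looks only at wording. A definition such as “Everyone on an equal basis” was categorized as concrete in the study even though it follows none of the listed forms, so it would be returned here as "needs judge".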


THE MAIN STUDY
Subjects and Procedure 


The subjects were male and female fresh- 
men at Michigan State College taking courses 
in Written and Spoken English. The gen- 
eral procedure was to administer to groups 
of 15 to 20 subjects at a time the 10-item 
Ethnocentrism Scale constructed by Levinson 
for the Public Opinion Study in Berkeley (7, 
13, 14). Immediately afterwards the subjects 
were instructed as follows: 


On the paper before you are a number of terms 
used very frequently nowadays. I am interested in 
finding out whether differences in opinions are due 
to the fact that different people do not mean the 
same thing when they use these terms. Would 
you please write down next to each term what you 
understand to be its meaning. Don’t spend more 
than a minute or so on each term. You don’t have 
to worry about being too precise in your definitions. 
Just let me know as briefly as possible what you 
understand to be the general meaning of each of 
these terms. It is important that you define every 
term. Go ahead. 


On a separate sheet the following ten con- 
cepts, five religious and five political-eco- 
nomic, were listed alphabetically: Buddhism,⁵
Capitalism, Catholicism, Christianity, Communism,
Democracy, Fascism, Judaism, Protestantism,
Socialism.

Following the procedure described above, 
data were secured for 144 native white, non- 
Jewish subjects. All others were eliminated
from the study. Each subject’s protocol was
assigned a random identification number,
and it was thus possible to categorize the
definitions without knowing the subject’s
prejudice score.

4 The writer is grateful to Dr. Frederick E. Reeve, Dr.
Charles G. Fulkerson, Miss Maude Shapiro, Mr. Charles
F. Hampton, and Mr. Herbert L. Hackett of the Department
of Written and Spoken English for their cooperation
in making subjects available for this study.

5 These 10 concepts were the same as those appearing
in the preliminary study with the exception that
Buddhism was substituted for Atheism. This substitution
was made following the discovery that Atheism, having
to do primarily with absence of belief in God, was
adequately defined by almost all subjects.

Reliability of Categorizations of Definitions
Reliability was determined by having two
judges⁶ independently categorize a sample
of 400 definitions obtained from 40 subjects.
For all 400 definitions, there was 67.5 per
cent agreement between the two judges.
While this represents a relatively unsatisfactory
degree of reliability for individual prediction,
it turned out to be adequate enough
to reveal the existence of significant differences
between different prejudice groups.

6 The writer wishes to express his appreciation to Mrs.
Bryna Graff for having served as the second judge.


Results

The 144 subjects were divided on the basis
of their prejudice scores into four quartile
groups of 36 subjects each. The Ss in each
quartile were then compared for frequency
of abstract, concrete, reification, and miscellaneous
definitions. Table 1 presents these
comparisons for all the 10 concepts taken
together.


TABLE 1


DEFINITIONS GIVEN TO TEN CONCEPTS BY PREJUDICE QUARTILES I, II, III, AND IV
(MEAN NUMBER OF DEFINITIONS IN EACH CATEGORY)


QUARTILE               N    ABSTRACT   REIFICATION   CONCRETE   MISCELLANEOUS

I. Very Low           36      4.47        1.06         1.58         2.89
II. Low               36      4.25        0.64         2.50         2.61
III. High             36      4.25        0.67         2.30         2.78
IV. Very High         36      4.14        0.58         2.47         2.81
II-IV Combined       108                               2.43
C.R. (I vs. II-IV)                        1.45         2.73


The results shown in Table 1 may be read
as follows: To all the 10 concepts the mean
number of abstract definitions given by
Quartiles I, II, III, and IV are 4.47, 4.25, 4.25,
and 4.14, respectively. Comparable means
for the reification definitions are 1.06, 0.64,
0.67, and 0.58, respectively. Comparable means
for the concrete definitions are 1.58, 2.50, 2.30,
and 2.47, respectively. Finally, comparable
means for the miscellaneous definitions are
2.89, 2.61, 2.78, and 2.81, respectively.

A study of the results obtained for Quartile I
(lowest prejudice quartile) shows that
the pattern of definitions given to all 10 concepts
is clearly different from those given by
Quartiles II, III, and IV. Moreover, the
means obtained for Quartiles II, III, and IV
for all types of definitions are seen to be
quite similar to each other.

While the subjects falling within the lowest 
quartile do not appear to differ appreciably 
from those falling within the other three 
quartiles with respect to the mean number 
of miscellaneous definitions, it will be seen 
that Quartile I gives somewhat more abstract 
definitions, more reification definitions, and 
less concrete definitions than Quartiles II, III, 
and IV. 

The significance of the differences was obtained
by first combining the results obtained
for Quartiles II, III, and IV and then comparing
these combined means with those of
Quartile I. These results are also shown in
Table 1.

It is seen that Quartile I differs most 
clearly from Quartiles II, III, and IV in mean 
number of concrete definitions. The mean 
for the former is 1.58 concrete definitions and 
for the latter, 2.43 concrete definitions. The 
difference in means is significant at the 1 per 
cent level of confidence (C.R.=2.73). 

With regard to the difference in abstract 
definitions, the results show that while Quar- 
tile I gives more abstract definitions than 
Quartiles II, III, and IV, this difference is 
not statistically significant. Similarly, Quar- 
tile I gives more reification definitions than 
Quartiles II, III, and IV. This difference 
reaches significance at the 10 per cent confi- 
dence level (C.R.=1.45). 

Reification, however, is a form of abstrac- 
tion and, therefore, it should be of interest 
to combine the abstract and reification defi- 
nitions into a single score and determine the 
significance of difference between the groups 
for such scores. The means for Quartile I 
and Quartiles II, III, and IV are 5.53 and 
4.84, respectively. The sigmas are 1.93 and 
2.66, respectively. The difference in means 
is significant at the 5 per cent confidence 
level (C.R.=1.67). 
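
As a check on the arithmetic, the critical ratio for the combined abstract-plus-reification score can be recomputed from the means and sigmas just quoted and the group sizes of 36 and 108. The sketch below rests on assumptions of ours that the text does not state: that the C.R. is the difference in means divided by the unpooled standard error of that difference, and that the confidence levels are one-tailed (a C.R. of 1.67 falls short of the two-tailed 5 per cent cut-off of 1.96). The function names are hypothetical.

```python
import math


def critical_ratio(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Difference in means over its standard error (assumed unpooled formula)."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / se_diff


def one_tailed_p(z):
    """Upper-tail area of the standard normal curve beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Quartile I versus Quartiles II, III, and IV combined.
cr = critical_ratio(5.53, 1.93, 36, 4.84, 2.66, 108)
p = one_tailed_p(cr)
print(f"C.R. = {cr:.2f}, one-tailed p = {p:.3f}")
```

Under these assumptions the ratio comes out at roughly 1.7, close to the reported 1.67. A small discrepancy is to be expected because the published means and sigmas are rounded and the exact standard-error formula used is not given.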

The data shown in Table 1 were further 
broken down for the five religious and the 
five political-economic concepts separately.⁷
These results are highly similar to those 
shown for all the 10 concepts taken together 
(Table 1). For the five religious concepts 
the pattern of definitions given by Quartile I 
is different from those given by Quartiles II, 
III, and IV and, furthermore, the results for 
Quartiles II, III, and IV are quite similar to
each other. The differences in mean number 
of concrete and reification definitions be- 
tween Quartile I, on the one hand, and 
Quartiles II, III, and IV, on the other, are 
both significant at the 5 per cent confidence 
level. The difference in mean number of 
abstract definitions, while in the expected 
direction, is again small and not statistically
significant. When the abstract and reification
definitions are combined to yield a
single score, the means for Quartile I and
Quartiles II, III, and IV are significantly
different from each other at the 5 per cent
confidence level.

7 To obtain this data, order Document 2747 from the
American Documentation Institute, 1719 N St., N.W.,
Washington 6, D. C., remitting $0.50 for microfilm
(images 1 inch high on standard 35 mm. motion picture
film) or $1.80 for photocopies (6x8 inches)
readable without optical aid.

The differences found for the five political- 
economic concepts are also consistent with 
the preceding—that is, the subjects in Quar- 
tile I give more abstract definitions, more 
reification definitions, and less concrete defi- 
nitions to the five political-economic concepts 
than do the subjects in Quartiles II, III, 
and IV. None of these differences, however, 
reaches statistical significance. 

The question may now be raised, as it was 
raised in our earlier study (19, 20), as to 
whether the results reported here are a func- 
tion of intelligence. Two investigators (17, 
30) have reported the existence of a negative 
relationship between prejudice and intelligence,
and, therefore, it is important to
determine to what extent group differences 
in the pattern of definitions found are related 
to intelligence. 

Intelligence scores, as measured by the 
American Council on Education Test 
(A.C.E.), were available in decile form for 
all 144 subjects. The product-moment cor- 
relation between prejudice and intelligence 
was found to be —.28, which is significant at 
the 1 per cent level and which is consistent 
with the results found by other investigators. 
There is, then, the possibility that the pattern 
of definitions given is related to intelligence. 
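
The reported significance of the correlation can be checked from r = -.28 and N = 144 alone. The sketch below assumes the conventional t test for a product-moment correlation with N - 2 degrees of freedom. The text does not say which test was actually applied, so this is an illustration rather than a reconstruction, and the function name is hypothetical.

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t = t_for_correlation(-0.28, 144)
print(f"t = {t:.2f} with {144 - 2} degrees of freedom")
```

The absolute value of t comes out near 3.5, which is beyond the two-tailed 1 per cent critical value of roughly 2.6 for 142 degrees of freedom, in agreement with the significance level stated above.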

To study this further, the mean of the total 
A.C.E. scores and the standard deviation 
were computed for each of the four quartiles 
separately. These results, as well as the 
standard errors, are shown in Table 2. The 
critical ratios are given in Table 3. 


TABLE 2


MEANS, STANDARD DEVIATIONS, AND STANDARD
ERRORS OF TOTAL A.C.E. SCORES FOR
QUARTILES I, II, III, AND IV


It is quite clear from Table 2 that an
inverse relationship exists between prejudice 
and intelligence. All the differences among 
quartiles, except that between Quartiles II 
and III, are significant at the 5 per cent 
confidence level or beyond. 

TABLE 3


CRITICAL RATIOS AMONG MEAN A.C.E. SCORES
FOR QUARTILES I, II, III, AND IV


It will be recalled that the pattern of definitions
given by Quartile I to all 10 concepts
differed significantly in certain important
respects from those given by Quartiles II, III,
and IV and, furthermore, that the patterns
of definitions found for Quartiles II, III, and
IV were quite similar to each other. If these
results were a function of intelligence then
it should be expected that the differences in
intelligence among the four quartiles should
also follow a similar pattern; that is, Quartile I
should manifest a significantly higher
mean A.C.E. score than Quartiles II, III, and
IV, and, furthermore, Quartiles II, III, and
IV should be relatively alike in mean A.C.E.
scores. Tables 2 and 3 indicate that this
expectation is not borne out. While Quartile I
is significantly different in mean A.C.E.
score from Quartiles II, III, and IV, Quartiles
II and III are also significantly different
from Quartile IV. It must be concluded,
therefore, that the differences in pattern of
definitions between Quartile I, on the one
hand, and Quartiles II, III, and IV, on the
other hand, are not clearly a function of differences
in intelligence among these groups.


Discussion 


How do the results found bear on the original
hypothesis? In general, the results
appear to confirm this hypothesis; that is,
those extremely low in prejudice will manifest
less concreteness and more abstractness
than those higher in prejudice. The results,
however, seem to deviate from the theoretical
expectations in two important respects: 

1. Instead of finding stepwise differences 
for the four prejudice quartiles it was found 
(a) that the pattern of definitions given by 
Quartile I is clearly different from those 
given by Quartiles II, III, and IV and (b) 
that the patterns of definitions given by 
Quartiles II, III, and IV are similar to
each other. These findings are clearly at 
odds with the ethnocentrism-rigidity-concrete 
thinking results found earlier (19, 20). While 
there were occasional exceptions, in general, 
a breakdown of the subjects into prejudice 
quartiles revealed stepwise increases in rigid 
and concrete solutions with increases in 
prejudice.⁸

The present results suggest, then, that 
except for those scoring extremely low in 
prejudice (Quartile I), the organization of 
social attitudes, at least with respect to the 
modes of thought presently under investi- 
gation, are similar for most people (Quar- 
tiles II, III, and IV). Why this should be 
so is, at the present, unclear to the writer. 
Further research, both clinical and experi- 
mental, will be necessary to answer this 
question. 

2. The results show substantial differences 
in frequency of concrete definitions between 
Quartile I and Quartiles II, III, and IV. 
Comparable differences in frequency of ab- 
stract definitions, however, are relatively 
small. This rather paradoxical finding may 
be partially explained by the further finding 
that the subjects falling within Quartile I 
gave substantially more reification definitions 
than did the subjects falling within the re- 
maining three quartiles. Reification, while 
qualitatively different from both concreteness 
of thinking on the one hand and “good” 
abstraction on the other hand, may neverthe- 
less be considered, by definition, as a form 
of abstract thinking. The reification process 
seems to involve some sort of intellectual 
groping to comprehend the full import of 
the principles involved (e.g., Catholicism, 
Judaism),⁹ but for reasons as yet undetermined
this goal is not fully reached and,
instead, the abstraction itself is reified. This
may be contrasted with concreteness of definitions
which seems to involve a dynamic
rejection of intellectual groping.

8 The data for quartile groups are not published.

9 An analogy: Students of psychology frequently reify
such psychological concepts as id, ego, and super-ego,
indicating an attempt, not completely successful, to comprehend
these concepts.

Considering further the differences found 
on concreteness of definitions, the results are 
in line with and illuminate further certain 
findings by Frenkel-Brunswik (4, 6). She 
reports that while high-prejudiced subjects 
are oriented more toward the following of 
arbitrary rules, low-prejudiced subjects are 
oriented more toward the following of prin- 
ciples. Her further finding that for high- 
prejudiced subjects there is a greater personal 
“tinting (of) political and social events” (6, 
p. 306) is congruent with the present finding 
of greater concreteness of definitions among 
more-prejudiced subjects. 

The results found, furthermore, seem to 
suggest that while different degrees of 
tolerance-intolerance may, perhaps, be equally 
resistant to change, this does not necessarily 
mean that all such phenomena may be sub- 
sumed under the rubric of rigidity. The 
processes underlying such resistances to 
change do not appear to be the same at the 
low-prejudice extreme (Quartile I) as those 
operating at other points on the prejudice 
continuum (Quartiles II, III, and IV). If 
concreteness is taken as a criterion of rigidity, 
as was suggested earlier, then it would fol- 
low that the subjects falling within Quar- 
tile I are characterized by less rigidity than 
those falling within the remaining three 
quartiles.¹⁰

The differences reported here do not clearly 
appear to be a function of differences in 
intelligence among the four quartile groups. 
Apparently we are getting at dynamic factors 
here, as we did in our earlier study (19, 20), 
which remain relatively untapped by stand- 
ard intelligence tests. 

The finding that the subjects in Quartiles 
II, III, and IV more frequently defined the 
concepts in terms of such concrete objects as 
Catholics, Protestants, Jews, etc., can be partially
understood in terms of the dynamic
interrelationships existing among prejudice,
rigidity, and concreteness of thinking.
Converging lines of evidence from the
social-psychological and psychopathological
investigations already cited suggest that both
concreteness of thinking and rigidity may
be conceived as means of defense against
threat and, hence, that higher-prejudiced
individuals feel threatened more frequently
than do low-prejudiced individuals.

10 In a second part of this study the subjects were
asked to describe in what way they thought the 10
concepts were interrelated. This part of the study will
be described elsewhere (23, 24). Suffice it to say here
that more-prejudiced subjects more frequently organized
the 10 concepts in a narrow and isolated manner, while
the less-prejudiced subjects more frequently organized
the 10 concepts into one comprehensive whole.

But why, it may be asked, should the sub- 
jects falling within the lowest prejudice 
quartile give more reification definitions than 
those falling within the remaining three 
quartiles? While we do not have any clear 
notion about this at the present time the 
thought occurs to us that tolerant persons 
have been frequently observed to be dogmatic 
in their beliefs.¹¹ On a strictly intuitive
basis, we could not help but wonder whether 
reification could possibly be one manifesta- 
tion of dogmatism. Unfortunately, the data 
obtained in this study are inadequate to per- 
mit an answer to this question. At any rate, 
the posing of the tentative hypothesis that 
reification of thought may be a manifestation 
of dogmatism immediately opens up several 
new problems of research. No empirical 
studies, as far as the writer could determine, 
are available either on reification or on dog- 
matism. Some methodological and theoreti- 
cal problems which suggest themselves for 
further research are: Can dogmatism be 
operationally defined and reliably measured? 
Is resistance to change when the objective 
conditions demand it sometimes a function 
of concreteness of thinking (rigidity) while
at other times a function of reification of
thinking (dogmatism)? What are the structural
and functional (dynamic) determinants
of reification and dogmatism on the one 
hand as against concreteness and rigidity on 
the other hand? Preliminary investigations 
designed to answer some of these questions 
are now under way. 

Turning now to a consideration of other
aspects of our data, the reader will recall
that the differential patterns of definitions
given by the high- and low-prejudice groups
reach statistical significance for the religious
concepts but not for the political-economic
concepts. This is consistent with the view
that the concept of ethnocentrism, while
positively related with attitudes toward
political-economic groups, has more to do
with attitudes toward religious and ethnic
groups (7, 14).

11 At first glance dogmatism and rigidity appear to be
synonymous. However, a distinction between these two
concepts may be made. Whereas rigidity, as we have
seen, refers to attitude organizations at a concrete level,
dogmatism seems to refer to organization of beliefs and
attitudes at an abstract level.


SUMMARY 


Theoretical considerations based upon 
earlier research led to the hypothesis that 
the attitudes of high-prejudiced individuals 
toward various social groups are more fre- 
quently organized in terms of concrete social 
objects while the attitudes of low-prejudiced 
individuals are more frequently organized in 
terms of the abstract principles for which 
various social groups stand. 

On the basis of the 10-item Ethnocentrism 
Scale constructed by Levinson for the Berke- 
ley Public Opinion Study, 144 freshmen were 
divided into four equal prejudice quartiles. 
These subjects were asked to define the fol- 
lowing 10 concepts: Buddhism, Capitalism, 
Catholicism, Christianity, Communism, De- 
mocracy, Fascism, Judaism, Protestantism, 
Socialism. Five of these are religious con- 
cepts; the other five are political-economic 
concepts. 

The definitions were categorized as ab- 
stract, reified, concrete, or miscellaneous.
There was 67.5 per cent agreement of cate- 
gorizations between two judges. 

Despite this relatively unsatisfactory relia- 
bility it was found that the subjects falling 
within the lowest prejudice quartile gave 
somewhat more abstract definitions, more 
reification definitions, and less concrete defi- 
nitions than did the subjects falling within 
the remaining three quartiles. These differ- 
ences in patterns of definitions were, on the 
whole, statistically significant for all the 10 
concepts taken together and for the five 
religious concepts. 

It was found, furthermore, that a signifi- 
cant negative relationship existed between 
prejudice and intelligence (as measured by 
the American Council on Education Test). 
The differences in intelligence among the 
four quartiles do not appear to account for
the differences found in patterns of definitions
given by Quartile I, on the one hand, and 
Quartiles II, III, and IV, on the other hand. 

The differences found between Quartile I 
and the other three quartiles with respect to 
abstract definitions and concrete definitions 
were in the hypothesized direction. Further- 
more, Quartile I gave more definitions in- 
volving reification than did Quartiles II, III, 
and IV. It was tentatively hypothesized that 
reification might be a manifestation of dog- 
matism, which is sometimes present in low- 
prejudiced individuals. Further research to 
study the differences between the dynamic 
determinants of rigidity and dogmatism is 
indicated. 


INTERPERSONAL COMMUNICATION IN SMALL GROUPS


BY LEON FESTINGER AND JOHN THIBAUT ¹


Research Center for Group Dynamics, University of Michigan 


SMALL face-to-face groups, or as they have
sometimes been called, primary groups,
play an important part in influencing
attitudes and opinions of their members.
This important fact about social behavior has 
been assumed for many years. In the past 
decade experimental facts have accumulated 
to substantiate this fact and to specify the 
relationships involved. 

In summary, the following is a list of some 
major conclusions which may be drawn from 
experimental work: 

1. Belonging to the same group tends to 
produce changes in opinions and attitudes 
in the direction of establishing uniformity 
within the group (5, 6). 

2. The amount of change toward uniform- 
ity which the group is able to accomplish is 
a direct function of how attractive belonging 
to the group is for its members (1, 2). 

3. Members who do not conform to the 
prevailing patterns of opinion and behavior 
are rejected by others in the group. The 
degree of rejection is a direct function of how 
attractive belonging to the group is for its 
members and of the importance for the group 
of the issue on which the member does not 
conform (2, 7). 

These facts leave unclarified the means by 
which such social influence is accomplished. 
The continual process of informal communi- 
cation among members of face-to-face groups 
in part represents the attempts to influence 
members by others in the group. To under- 
stand completely the social influences which 
groups exert we must, then, also understand 
the determinants of what does and does not 
get communicated in social groups, and who 
are the recipients of communications. There 
are some data available. These may be sum- 
marized as follows: 

1. Persons whose social behavior is changed
by hearing something tend to relay this information
to others who are seen as likely to
be affected by it (2, 3).

1 This study was conducted under contract with the
Office of Naval Research (N6onr-23212 NR 151-698).
It is part of a larger program of research on social communication
and influence now being conducted at the
Research Center for Group Dynamics at the University
of Michigan.

2. Persons who do not conform to the 
group pattern tend to have fewer communi- 
cations addressed to them if they are rejected 
but tend to have more communication 
addressed to them if they are not rejected (7). 

A more detailed understanding of this 
process of communication and its relation to 
the process of influence is the major purpose 
of the theories and experiments reported in 
this paper. 


THEORETICAL ORIENTATION 


The fact that groups do exert pressures on 
their members toward uniformity is beyond 
dispute. For our immediate purposes we 
need not concern ourselves with the sources 
of these pressures or the reasons for their 
existence. We will look only at the effects of 
these pressures toward uniformity on the 
communication and influence process that 
actually takes place in a group. A group may 
be looked upon as composed of a number of
parts with each part characterized by a given
“state” ² with respect to a certain dimension.
If the group has the property of tending 
toward uniformity of state, then any dis- 
crepancy among the different parts of the 
group will give rise to forces which will be 
exerted on parts of the group to change their 
state in such a way as to re-establish uni- 
formity. The strength of these forces will be 
a function of the magnitude of the tendency 
toward uniformity which the group possesses. 

The force to change exerted on any par- 
ticular part of the group is also a direct func- 
tion of the discrepancies in state between 
this part and all other parts of the group. 
The larger the discrepancy between part A 
and part B, the larger will be the force 
exerted on A by B and on B by A, since the
disequilibrium is greater, the greater this
discrepancy.

2 In the experiments to be described later an individual
person is coordinated to a part of a group and an
opinion concerning a certain issue to the state of the
parts of the group. Cliques of people, levels in an
organization, or work groups may also be looked on
as parts of a group.

The preceding hypotheses concerning tend- 
encies toward uniformity within a group do 
not, of course, hold for any arbitrarily defined 
collection of individuals or parts. When dis- 
crepancies exist among a collection of per- 
sons, uniformity of any group that exists 
within this collection can be achieved either 
by the exertion of forces to change various 
parts of the group or, alternatively, by forming
the group in such a way that uniformity
already exists. Redefinition of the bound- 
aries of the psychological group (changing 
the membership composition) can, then, also 
be a response which the group makes to pres- 
sures toward uniformity. 

In a group where the tendencies toward 
uniformity concern an opinion about some 
issue, the exertion of pressures on persons to 
change their opinion must of course make 
themselves felt through a process of com- 
munication among them. What can we infer 
about this process of communication from 
the hypotheses we have presented? 

1. Within a psychological group communi- 
cations should be directed mainly toward 
those members whose opinions are extreme as 
compared to the opinions of the others. This 
would follow from our hypothesis that the 
strength of the force applied on any part of 
the group is a direct function of the dis- 
crepancy between the state of that part and 
the states of the other parts of the group. 

2. If it is possible for a group to subdivide 
or exclude members then, as the discrepancies 
in state become clear, there will be tendencies 
to cease communicating to the extremes. 
This would follow from a number of con- 
siderations that have been stated or implied 
above. 

a. If it is impossible for the group to rede- 
fine its boundaries, then uniformity can only 
be achieved through changing others and 
being receptive to change. 

b. If it is possible to redefine the bound- 
aries of the group then uniformity can also 
be achieved by omitting the persons with 
extreme opinions from the group. 

c. The perception that it is possible to 
redefine the boundaries of the group should, 
then, have two consequences. There should 
be greater resistance to change on the part
of the members, and there should be less
communication to those who may be ex- 
cluded from the group, namely, those with 
extreme opinions. 

3. The less the pressure towards uniformity 
in a group and/or the greater the possibility 
for the group to subdivide, the less will 
be the actual accomplishment of influence. 
Since both of the factors here mentioned will 
affect the readiness of members to change in 
response to influence which is exerted on 
them, and since possible group subdivision 
will also prevent the exertion of influence on 
the most deviant members, it follows that the 
end result of the process of communication 
will be less uniformity in the group if sub- 
division is seen as possible or if the tend- 
encies toward uniformity are weaker. 
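
The first inference above (that communication should flow mainly toward the extremes) can be illustrated numerically. The following is a minimal sketch, assuming that the "force" on a part is simply the sum of its absolute discrepancies from every other part; the text says only that the force is a direct function of those discrepancies, so this particular form, the function name, and the example opinions are all hypothetical.

```python
def forces_toward_uniformity(states):
    """Return, for each part, the summed absolute discrepancy from every other part.

    Assumption for illustration only: 'force' is taken to be the plain sum of
    absolute differences in state; the text specifies only a direct function.
    """
    return [
        sum(abs(own - other) for j, other in enumerate(states) if j != i)
        for i, own in enumerate(states)
    ]


# Hypothetical opinions of six members on the seven-point scale.
opinions = [1, 3, 4, 4, 5, 7]
forces = forces_toward_uniformity(opinions)
for opinion, force in zip(opinions, forces):
    print(f"opinion {opinion}: force {force}")
```

Under this assumed form the members at points 1 and 7 receive the largest force, which is the pattern the first inference predicts for the direction of communication.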

The experiments which are described 
below were specifically designed to test these 
hypotheses. In the description of the pro- 
cedure we will elaborate further on the 
operational definitions of the theoretical 
concepts. 


EXPERIMENTAL PROCEDURE 

Subjects 

The subjects used in these experiments 
were college undergraduates recruited from 
the various sections of the elementary psy- 
chology course and the elementary course in 
educational psychology at the University of 
Michigan. All subjects were volunteers. 


General Characteristics of the Groups Formed

Sixty-one groups were studied of which 24 
were composed entirely of women, 37 of 
men. The size of groups ranged from 6 to 
14 members. Each group assembled in the 
experimental room, and each member was 
assigned one of a series of small tables 
arranged in a circle. Each member was 
identified by a letter which was printed on 
a 5 by 8 inch card placed in front of him so 
that all others could see it. 


General Setup for All Groups 

Each group was given one problem to con- 
sider. The problem was such that opinions 
concerning it could be placed on a prescribed 
seven-point continuum. Each member was 
given seven 5 by 8 inch cards with numbers 
corresponding to those on the seven-point 
scale of opinion. The members were instructed
to consider the problem and then, all
simultaneously, to place in front of them that 
card which represented their tentative opin- 
ion on the matter at issue. The experimenter 
then proceeded to call attention to each per- 
son’s decision in order both to verify it and 
to insure that all were fully aware of it. 

Small slips of paper bearing some addi- 
tional information relevant to the problem 
were then distributed at random among the 
subjects. It was announced that each mem- 
ber of the group was receiving a different 
item of information. The purpose of this 
part of the procedure was to maximize the 
initial force to communicate by causing each 
member to believe that he had some unique 
information relevant to the problem-solving 
activity. Actually, however, only two items 
of information were distributed. One item 
was intended to push the member toward 
the upper end of the scale, the other toward 
the lower end. This device was essential to 
get adequate dispersion along the scale. 

After the subjects had read the new infor- 
mation, each recorded directly on his infor- 
mation slip his identifying letter and the 
scale number representing his current opin- 
ion. These were collected by the experi- 
menter and read aloud in order to make 
public the new opinions. Any member 
whose opinion had changed was asked to 
make the appropriate change in the num- 
bered card in front of him. 

With this preliminary procedure finished, 
the experimenter described the manner in 
which the problem was to be discussed. 
Stapled pads of paper were distributed to the 
subjects. For each pad the staple was placed 
in a slightly different position on the page. 
These differences were undetectable to the 
subjects, but they allowed the experimenter 
subsequently to match each pad with the 
member to whom it had been given. The 
subjects were informed that discussion about 
the problem had to be restricted to writing 
notes to one another. The subjects were left 
free to include anything they liked in the 
notes. However, a member could write a 
note to only one person at a time, and each 
note must bear only the letter of the person 
to whom it was directed; no reference to the 
sender’s identity was permitted. This rule
was adopted to minimize the chances that
any member, in the act of deciding to whom
to direct a note, would be affected by a
knowledge of which people had sent notes to
him. On completing a note, a sender was to 
raise his hand, whereupon the experimenter 
or his assistant would deliver it to the re- 
cipient. It was emphasized that if and when 
any member decided to change his opinion, 
he should change the numbered card in front 
of him. 

At a signal from the experimenter the sub- 
jects then began to write notes. As each 
note was finished, the messenger (experi- 
menter or assistant) took the note, recorded 
on it the time in minutes and seconds from 
the starting signal, and dispatched it. A 
record was also kept of the exact time of each 
change of opinion, that is, of each change in 
the numbered card in front of a subject. 
The note-writing continued for 20 minutes. 


The Discussion Problems 


Two problems were used in the course of 
experimentation. A problem in football
strategy was assigned to 31 of the groups,
and a problem in evaluating a case study of
a delinquent boy was assigned to the remain- 
ing 30 groups. 

The problem in football was concerned 
with making a decision about the best 
strategy for an imaginary anonymous team 
which has the ball on the 50-yard line, first 
down, 5 minutes of play remaining, with the 
score 18-18. Seven alternative types of 
strategies are outlined to the subjects. These 
range from extremely conservative power 
plays (at point 1) to extremely reckless pass 
plays (at point 7). The two items of additional
information distributed among the
subjects are that “our star running back has 
just been injured...” (intended to push 
the recipient upward on the scale) and that 
“the opposing team has tightened up its 
pass defense and has caught on to our spec- 
tacular plays” (intended to push the recipient 
downward on the scale). 

The case study was a brief fictitious 
account of the history of a boy who had 
caused trouble all through his life and who 
had ended in jail. The history of the boy
was deliberately made to be as ambiguous 
as possible in order to encourage dispersion 
on the scale of opinion about the best possible
way of treating the case. The subjects
were told that by prior decision of the social 
workers assigned to the case, the boy was to 
be put into a foster home; the assignment 
for the subjects was to determine the best 
type of home for this boy. The scale of 
opinion consisted of seven alternative types 
of foster homes, ranging from one in which 
love and kindness were exclusively empha- 
sized (point 1) to a home in which discipline 
and punishment were exclusively used (point 
7). The two items of additional information 
received by the subjects were: (1) that for a 
period of a year his mother, acting on the 
advice of a social worker, had tried to make 
the boy’s home life warm but that it did no 
good, since his criminal activity increased 
(intended to push the recipient upward on 
the scale) and, (2) that the boy’s old- 
est brother had returned home for a while 
and had given the boy stern but fatherly 
discipline but that the boy’s delinquency only 
worsened (intended to push the recipient 
downward on the scale). 

The selection of these two problems was 
guided by our need to create discussion situ- 
ations in which there would be markedly 
different amounts of resistance to change of 
opinion. In the case study problem, it was 
felt that subjects would bring into the ex- 
perimental situation fairly strong predispo- 
sitions toward certain of the scaled opinions. 
These predispositions could be expected to 
be quite resistant to change. 

In the football problem, on the other hand, 
there was no expectation that strong prejudg- 
ments would be imported into the situation. 
Relative to subjects working on the case 
study problem, the subjects ought more 
readily to accept the present experimental 
group as a relevant reference group for their 
opinions and hence ought to be relatively 
less resistant to change. 


Experimental Variations 


Six experimental variations were applied
to each of the two problems. These variations
were created by further instructions
over and above the general instructions
already described. Five groups (three male
and two female) were assigned to each of the
experimental variations in each of the problems,
except for variation V in the football
problem, which had six groups, four male and
two female.
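
The allocation just described can be checked against the totals reported earlier (61 groups, 37 of men and 24 of women, with 31 on the football problem and 30 on the case study). The tally below is only an illustrative cross-check; the dictionary layout and the helper name `count_groups` are hypothetical and not part of the original study.

```python
# Group counts per (problem, variation) as stated in the text:
# three male and two female groups in every cell, except variation V of the
# football problem, which had four male and two female groups.
problems = ["football", "case study"]
variations = ["I", "II", "III", "IV", "V", "VI"]

design = {(p, v): {"male": 3, "female": 2} for p in problems for v in variations}
design[("football", "V")] = {"male": 4, "female": 2}


def count_groups(problem=None, sex=None):
    """Count groups, optionally restricted to one problem and/or one sex."""
    total = 0
    for (p, _variation), cell in design.items():
        if problem is not None and p != problem:
            continue
        total += cell[sex] if sex is not None else sum(cell.values())
    return total


print("all groups:", count_groups())
print("football:", count_groups(problem="football"))
print("case study:", count_groups(problem="case study"))
print("male:", count_groups(sex="male"), "female:", count_groups(sex="female"))
```

The counts agree with the figures given under the general characteristics of the groups and under the discussion problems.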


Instructions to Create the Homogeneity- 
Heterogeneity Variable 


In the first three variations, the intention 
of the additional instructions was to create 
a perception that there was no basis for group 
subdivision among them. To achieve this 
perception the homogeneity of the group was 
emphasized as follows: “You people in this 
group have been deliberately selected to 
make up the kind of group we are interested 
in observing. You have been selected in 
such a way that we believe you all will have 
about an equal interest in this problem and 
about equal knowledge about it... .” 

In the second three variations it was in- 
tended that the subjects perceive the possi- 
bility of group subdivision as having a basis 
in fact. The heterogeneity of the group was 
emphasized by telling the members that they 
had been selected to be as different as pos- 
sible both in their interest in the problem 
and in their knowledge about it. 


Instructions to Create the Pressure-Toward- 
Uniformity Variable 


Variation I (High pressure-homogeneity, 
abbreviated H-Hom). In this variation we 
were interested in creating very strong pres- 
sures toward uniformity of opinion. The 
group was told that the experimenter’s inter- 
est was in observing how a group went about 
coming to a unanimous decision. Thus, 
whatever intrinsic pressures toward uni- 
formity might exist in the group were 
strengthened by externally induced pressures. 

Variation II (Medium pressure-homogeneity,
abbreviated M-Hom). This variation
was designed to produce pressure toward 
uniformity of a conditional nature. The 
instructions were that a body of experts (the 
coaching staff of the University of Michigan 
football team, for the football problem, and 
some members of the Law School faculty for 
the case study problem) had considered the 
problem and had unanimously decided that 
one of the seven scale points represented the 
“correct solution.” The group was told that 
it would receive a score for its performance, 
which would be the proportion of members 
who at the conclusion of the experiment were 
recommending the “correct solution.” 

Variation III (Low pressure-homogeneity, 
abbreviated L-Hom). No external pressure 
toward uniformity was applied in this vari- 
ation. The group was merely informed that 
the experimenter was interested in observing 
the way a group went about discussing such 
a problem. In this case, it was supposed that 
if any pressure toward uniformity developed 
it would be attributable to a need for “social 
reality” within the group (2, 4). According 
to this principle, there is a force on the group 
member to achieve support for his point of 
view; and to the extent that this point of 
view is untestable by demonstration, the 
member is increasingly required to accept 
the criterion of social agreement with a rele- 
vant reference group. 

Variation IV (High pressure-heterogene- 
ity, abbreviated H-Het). In this variation we 
were intent on establishing high pressure 
toward uniformity while at the same time 
permitting the formation of subgroups. The 
variation includes instructions that the group 
is composed of heterogeneous members. 
Otherwise it is largely a counterpart of 
Variation I (H-Hom). This time, however, 
instead of asking for a unanimous decision, 
the experimenter informed the group that a 
plurality would be sufficient. The group 
would be taken as recommending the de- 
cision which the greatest number of mem- 
bers accepted. In addition, the subjects were 
told that in such heterogeneous groups as 
this, one usually did not find more than 
twenty per cent of the members agreeing on 
any one alternative. These last two instruc- 
tions were made somewhat different from 
the instructions in the homogeneity condi- 
tions in order to allow sub-group formation 
to take place. 

Variation V (Medium pressure-heterogeneity,
abbreviated M-Het). This variation
was also expected to permit subgroup forma- 
tion. The instructions to these groups were 
substantially the same as for variation II 
(M-Hom) except for the emphasis on hetero- 
geneity of the members and an additional 
instruction that it was not customarily pos- 
sible for more than twenty per cent of the 
group to hit upon the “correct solution.” 

Variation VI (Low pressure-heterogeneity,
abbreviated L-Het). Except for the pretense
that the group was heterogeneously composed,
this variation was precisely the same
as Variation III (L-Hom).

The following tabulation is presented to 
help clarify the relations among the six 
experimental conditions: 


                        PRESSURE TOWARD UNIFORMITY

                        HIGH      MEDIUM    LOW

Homogeneous Group       I         II        III
Heterogeneous Group     IV        V         VI





EXPERIMENTAL RESULTS 


Hypothesis I 

The volume of communication between 
two persons is a function of the magnitude 
of the discrepancy between their opinions. 
Since extreme opinions are most discrepant 
from all the other opinions, we would there- 
fore predict that most communications should 
be directed toward members who hold extreme
points of view.
FIG. 1. 


Figure 1 summarizes the experimental 
findings relevant to this prediction in terms 
of the weighted number of communications. 
The distribution of opinions within the 
group could affect the pattern of communi- 
cation. Thus, for example, if six members 
held extreme opinions and only three mem- 
bers maintained “middle” opinions, we would 
obtain a preponderance of communication 
to the extremes even if members were 
addressed at random. To correct for this, 
each message was weighted by the inverse 
of the number of persons in the group in the
same relationship to the communicator as
the recipient of that particular message.
Thus, a communication directed toward a
person at an extreme was divided by the
number of persons in the group (excluding
the sender of the message) who held extreme
opinions at the time. When the weighted
number of communications initiated during
the first ten minutes* of each session is
plotted against the location of the recipient
(in terms of being at an extreme position,
one point away from the extreme position,
etc.), the curve falls off rapidly. This relationship
seems to hold about equally for
groups discussing the football problem and
for groups discussing the case study problem.
Our hypothesis is confirmed: the
volume of communication directed toward a
group member is a function of his nearness
to the extreme of a range of opinions.

* Exactly the same type of curve is found for the
second ten minutes of discussion. The curve is so consistent
that only two are shown as examples.


Hypothesis II


Since communication tends to be directed
toward the extremes of a psychological group,
it is predicted that where the formation of
subgroups (redefinition of the boundaries of
the group) is possible there will be less communication
directed toward the extremes of
the experimental (arbitrarily defined) group.
Since the heterogeneity condition provided
more basis for subgroup formation than did
the homogeneity condition, we may expect
greater decreases in communication toward
the extremes in the former as subgroups are
given time to develop.

This hypothesis was tested in the following
way. For each experimental group the mean
value of the frequency curve showing the
distribution of weighted number of communications
according to the location of the
recipient (as in Fig. 1) was computed. For
example, the mean of the distribution for the
football problem in Figure 1 is .84 units
away from the extreme opinion. This mean
value is taken as an index of the tendency to
communicate to the extremes. Low values
of the index indicate a high proportion of
communication to the extremes.

Table 1 presents these indices separately
for the first and second 10 minutes of discussion
for each experimental variation on
the football problem. Table 2 gives the same
data for the case study problem. In order
to examine these data from the point of view
of hypothesis II we will compare the indices
of the first 10 minutes with the indices of the
last 10 minutes in each variation. If our
hypothesis is correct we would expect to find
the indices increase for the heterogeneity
conditions more than for the homogeneity
conditions.


TABLE 1

MEAN COMMUNICATION INDICES FOR FOOTBALL PROBLEM DISCUSSIONS

            FIRST TEN MINUTES          SECOND TEN MINUTES
            HIGH    MEDIUM   LOW       HIGH    MEDIUM   LOW

Hom          .68     .85     .88        .74     .63     .86
Het          .83     .83     .86        .75    1.30     .99

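
The weighting of messages and the resulting index can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' computing procedure: opinions are held fixed (the study weighted each message by the opinions held at the time it was sent), a recipient's location is measured as scale points from the nearer end of the range of opinions currently held in the group, and the function names and example data are hypothetical.

```python
from collections import defaultdict


def distance_from_extreme(opinion, held_opinions):
    """Scale points between an opinion and the nearer end of the range held in the group."""
    low, high = min(held_opinions), max(held_opinions)
    return min(opinion - low, high - opinion)


def communication_index(opinions, messages):
    """Mean location of recipients, with each message weighted as described in the text.

    opinions: dict mapping member letter -> scale point (held fixed in this sketch).
    messages: list of (sender, recipient) pairs; a member never writes to himself.
    Each message counts 1 / (number of members, excluding the sender, who stand at
    the same distance from the extreme as the recipient).
    """
    held = list(opinions.values())
    location = {m: distance_from_extreme(o, held) for m, o in opinions.items()}
    weighted = defaultdict(float)
    for sender, recipient in messages:
        d = location[recipient]
        same_location = sum(1 for m in opinions if m != sender and location[m] == d)
        weighted[d] += 1.0 / same_location
    total = sum(weighted.values())
    return sum(d * w for d, w in weighted.items()) / total


# Hypothetical five-person group and four notes.
opinions = {"A": 1, "B": 2, "C": 4, "D": 4, "E": 7}
messages = [("C", "A"), ("D", "E"), ("B", "A"), ("A", "C")]
index = communication_index(opinions, messages)
print(round(index, 2))
```

In this example three of the four notes go to members at the extremes, so the index is low; an index near zero would mean that nearly all weighted communication was addressed to the extremes.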

Examining the homogeneity conditions 
first we find no tendency toward any change 
from the first to the second 10 minutes. For 
the high pressure condition there is an ex- 
tremely slight and insignificant increase for 
both discussion problems. For the medium 
pressure condition there is a tendency for the 
index to decrease, which again does not 
approach significance. For the low pressure 
condition the index stays virtually the same 
for the football problem and increases insig- 
nificantly for the case study problem. 

In the heterogeneity conditions a quite dif- 
ferent picture presents itself. In the high 
pressure condition there is no change in the 
index, but in the medium and low pressure 
conditions there are consistent increases in 
the index from the first to the second 10 
minutes. Two of these four increases, the 
medium condition for the football problem 
and the low condition for the case study 
problem, are significant at the 5 per cent level 
of confidence. Taken together the changes 
in the medium and low pressure conditions 
are significant at the 1 per cent level of 
confidence. 

These results seem to substantiate but
qualify hypothesis II. While the homogeneity
conditions show no increase in the index,
the heterogeneity conditions show such an
increase only where the pressure toward uniformity
is sufficiently low to permit subgroup
formation. In the high pressure conditions
where strong pressures toward uniformity
are exerted by the experimenter on the total
group, subgroup formation does not occur.
Where the pressure toward uniformity is
weaker, subgroup formation does occur
when a basis for it (perception of heterogeneity)
exists.


TABLE 2

MEAN COMMUNICATION INDICES FOR CASE STUDY PROBLEM DISCUSSIONS

            FIRST TEN MINUTES          SECOND TEN MINUTES
            HIGH    MEDIUM   LOW       HIGH    MEDIUM   LOW

Hom          .27     .62     [...]      .35     .56     .74
Het          .31     .50     [...]      .30     .72     .78


It is also apparent from Tables 1 and 2
that in both the homogeneity and heterogeneity
conditions, increasing the magnitude
of pressure toward uniformity produces more
communication toward the extremes. If we
compare the indices for the high pressure
and low pressure conditions we find that in
the eight possible comparisons, the index for
low pressure is greater in seven instances and
tied in one instance. The index for medium
pressure is higher than for high pressure in
six of eight possible comparisons and tied
in one instance. There is no consistency in
the comparison between the medium and low
pressure conditions.

In view of the consistency of the result we
may conclude with a high degree of confidence
that high pressure toward uniformity
results in increased communication to the
extremes. This result probably depends upon
the degree to which tendencies to communicate
arising from other sources can compete
with communications resulting from pressures
toward uniformity. When pressures
toward uniformity become very high, these
other forces in the situation may become less
effective in comparison.


Hypothesis III

As pressure toward uniformity increases,
both pressure to communicate and readiness
to change also increase. Since both of these
factors are conducive to change, there should
be increasing change toward uniformity of
opinion as the pressure toward uniformity
increases.

In order to test this hypothesis a measure
of the amount of change toward uniformity
was calculated for each experimental group.
The index used was the quotient of the
standard deviation of opinions within the
group by the end of the 20-minute discussion,
divided by the standard deviation within the
group at the beginning. The lower the
index, the greater has been the change
toward uniformity of opinion. Thus, for
example, an index of 1.0 represents no change
at all, and this value may be regarded as a
base line in the figure.
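
The index of change toward uniformity described above can be written out directly. The sketch below is illustrative only: the function name and the example opinions are hypothetical, and the population standard deviation is used as an assumption, since the text does not say which form was computed (for a group of fixed size the ratio is the same with either form).

```python
from statistics import pstdev


def uniformity_index(initial_opinions, final_opinions):
    """Final standard deviation of opinions divided by the initial standard deviation.

    Values below 1.0 indicate movement toward uniformity; 1.0 indicates no change.
    """
    start_sd = pstdev(initial_opinions)
    if start_sd == 0:
        raise ValueError("group was already uniform; the index is undefined")
    return pstdev(final_opinions) / start_sd


# Hypothetical five-person group before and after the 20-minute discussion.
initial = [1, 2, 4, 6, 7]
final = [3, 3, 4, 5, 5]
index = uniformity_index(initial, final)
print(round(index, 2))
```

A group whose members end where they began yields exactly 1.0, the base line mentioned in the text.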

Figure 2 presents these indices for each of 
the experimental variations and for each of 
the discussion problems. It can be seen that 
in each instance, as the pressure toward uni- 
formity is decreased, the amount of change 
toward uniformity is decreased. The trends 
may be regarded as significant well beyond 
the 1 per cent level of confidence, since the 
probability of obtaining this predicted order 
of three points in four independent com- 
parisons would be by chance about one 
in a thousand. The data fully support 
hypothesis III. 
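
The chance probability cited above can be reproduced with a short calculation. It assumes, as the argument in the text implies, that under chance each of the 3! = 6 possible orderings of the three pressure conditions is equally likely within a comparison and that the four comparisons (two problems by two composition conditions) are independent; the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

# Under chance, each ordering of the three points (high, medium, low) is equally likely.
orderings_of_three = factorial(3)
p_one_comparison = Fraction(1, orderings_of_three)

# Two discussion problems by two composition conditions give four comparisons.
independent_comparisons = 4
p_all = p_one_comparison ** independent_comparisons

print(p_all, float(p_all))
```

The result is 1/1296, a little under one in a thousand, which agrees with the figure stated in the text.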


Hypothesis IV 


If subgroup formation is seen as possible, 
the readiness to change when influence is 
exerted should be less than where no subgroup
formation is possible. In addition, in
the former case there is less actual exertion 
of influence on the extreme opinions in the 
group. Both of these factors should combine 
to produce less change toward uniformity in 
the heterogeneity than in the homogeneity 
conditions. 

Figure 2 shows the data relevant to this
hypothesis. The difference between the
amount of change in the heterogeneity and
homogeneity conditions is highly significant
(beyond the 1 per cent level by analysis of
variance) when the football problem is discussed.
There is, however, little or no difference
between these two conditions when the
case study problem is discussed.

FIG. 2. MEAN AMOUNTS OF CHANGE TOWARD UNIFORMITY OF OPINION (plotted against PRESSURE TOWARD UNIFORMITY)

It will be recalled that the case study prob- 
lem was selected in the belief that subjects 
would bring with them fairly strong predis- 
positions toward certain of the opinions, 
which would be relatively more resistant to 
change. The football problem was selected 
in the belief that subjects would not bring 
such predispositions into the experimental 
situation. This difference between the two 
discussion problems is clearly reflected in the 
much lower degree of change toward uni- 
formity in the case study problem. It is also 
probable that this relatively high resistance
to change in the case study problem made
the added effect of the heterogeneity-homo- 
geneity difference relatively negligible. 

We may conclude that, where strong predispositions
do not exist and where, consequently,
the group has power to change
opinions, the perception of heterogeneity 
will increase resistance to change. Hypothe- 
sis IV, thus amended, may be considered to 
be substantiated. 


SUMMARY 


The variables of (1) amount of pressure 
toward uniformity existing in a group and 
(2) the degree to which the members per- 
ceived the group as homogeneously com- 
posed, were manipulated experimentally in a 
laboratory setting of a discussion group to 
test certain hypotheses concerning the pat- 
tern of communication within the group and 
the amount of change in opinion which 
occurs. The results strongly support the 
theoretical hypotheses and may be sum- 
marized as follows: 

1. When there is a range of opinion in the 
group, communications tend to be directed 
towards those members whose opinions are 
at the extremes of the range. 

2. The greater the pressure toward uni- 
formity and the greater the perception of 
homogeneous group-composition, the greater 
is the tendency to communicate to these 
extreme opinions. 

3. The greater the pressure toward uni- 
formity and the greater the perception of 
homogeneous group-composition, the greater 
is the actual change toward uniformity which 
takes place. 
INTELLIGIBILITY OF SPOKEN MESSAGES: LIKED AND 
DISLIKED 


BY HARRY M. MASON AND BARBARA K. GARRISON 
Whitman College 


Recent dynamically oriented studies (1,
4) purport to show that values and
needs are significant factors in the perception
of stimuli. The present experiment,
an attempt to determine whether such ele- 
mental value responses as liking and dislik- 
ing are of importance in voice communica- 
tion, suggests that familiarity with content 
ideas, rather than strength of related values, 
may be the crucial factor in such results. 

The present problem is to determine which 
stimuli college student subjects (Ss) tran- 
scribe the more accurately: noise-blurred 
three-word messages suggesting activities the 
Ss would like or similar messages suggest- 
ing activities Ss would dislike. 


METHOD 


Two upper-division classes in psychology 
served as Ss. Group I, Industrial Psychology, 
contained 10 males and four females; Group 
II, Child Psychology, contained 20 females 
and six males. Experimental sessions for the 
two groups were separated by approximately 
two weeks. 

Sixty-three three-word sentences were con- 
structed; each was a request to participate in 
some activity; each was composed of two one- 
syllable words followed by a two-syllable 
word accented on the first syllable. Twenty- 
one sentences were intended to suggest activi- 
ties generally liked by college students; like 
numbers were intended to suggest activities 
toward which students would be indifferent 
and activities which they would dislike. Care 
was taken to use words commonly employed 
in speech by college students. Examples, in 
the order of liked, indifferent, and disliked, 
according to experimenters’ intentions, were:
Dress for dancing; blow out matches; and get 
wrong answers. Sentences were recorded on 
wire by a practiced speaker, who monitored 
his voice level by means of an output meter. 
Statements, originally listed according to in- 
tended attitude response, were randomized. 
An announcement, “Number ___,” preceded
each sentence. Ten seconds elapsed between
sentences.

This recording was copied to another wire, 
being played for re-recording at a low level, 
in order to increase deliberately the noise in 
relation to the signal. By re-recording three 
times in this fashion, a copy exhibiting a 
highly uniform and satisfactory noise level 
was obtained. Instructions, “explaining” 
that this was a test of ability to transcribe 
emergency telephone messages, and indicat- 
ing how to mark blanks, were recorded 
to precede the test items. The Ss were in- 
structed to copy what they heard or thought 
they heard. The need to be accurate when 
emergencies exist and communications are 
likely to be poor was stressed in order to 
lessen any possible tendency toward ego-in- 
volvement which might have been set up by 
the intimation that this recording was a 
“test.” 

Recordings were made on a Webster-Chi- 
cago Model 80 wire recorder which had been 
modified to extend high and low frequency 
cutoff points and to provide additional tone 
controls. A second machine, unmodified, was 
used to play recordings for copying. The 
signal was fairly understandable to one accus- 
tomed to it, but difficult upon first hearing. 
It resembled stimuli used in communications 
listener-training experiments during the re- 
cent war as exemplified in the work of Mason 
(3) and others, rather than the “verbal 
summator” stimulus made familiar by Skin- 
ner (5) as an “auditory ink-blot.” 

The completed recording was played for 
the Ss during regular class periods, a 10-inch 
loudspeaker situated on the instructor’s table 
being used for this purpose. Since each S 
was his own control and messages were 
randomized, it was not necessary to make the 
over-all difficulty of the test strictly com- 
parable from S to S. 

After the Ss had written out all messages 
as well as they could, they were given mimeo- 
graphed blanks containing correct copies of 
the messages; each message was followed by
the letters: L, l, I, d, D. Instructions indicated
that the L should be encircled to indicate
“I like very much to do this,” the l for “I
like to do this,” etc. This attitude record provided
for consideration of individual likes
and dislikes and checked the experimenters’ 
judgment concerning the reactions of the 
students to messages. 

Although the intelligibility of the words 
used in the messages should be roughly 
equated through use of the same syllable 
structure in all messages, it was desired to 
control this variable more precisely. To do 
this, a second stimulus recording was made, 
presenting first and last words of the sen- 
tences used in the messages. The recording 
was made in the following manner: Each 




two-syllable last half receiving somewhat 
more blurring. 

This word-intelligibility test was presented 
to a class of 25 lower-division psychology 
students, whose correct transcriptions of 
words served as indexes of their relative in- 
telligibility as spoken in the sentence test. 


RESULTS 


In scoring attitude responses of Group I 
and Group II to the sentence messages, it was 
found that 22 sentences were rated “liked” by 
half or more of each group. Twelve sentences 
satisfied the same criterion with regard to dis- 
like reactions. The word-intelligibility of the 
22 generally liked sentences averaged substan- 
tially higher than that of the generally disliked
sentences.¹ By use of the word-ratings


TABLE 1 


LISTENING SCORES² RELATED TO ATTITUDES TOWARD MESSAGES








LIKE: MEAN, S.D.
INDIFFERENT: MEAN, S.D.
DISLIKE: MEAN, S.D.
LIKE−DISLIKE: MEAN





Individual reactions 
Group I 
Group II
Generally liked and disliked messages 
Group I 
Group II 
Generally liked and disliked messages 
paired for word-intelligibility 
Group I 
Group II 


41.7 11.6
39.4 12.8


16.5
17.5





Score is per cent of liked, indifferent, or disliked statements correctly transcribed. 
* Significantly greater than zero, at the 1 per cent level of confidence. 
** Significantly greater than zero, at the 2 per cent level of confidence 


sentence was copied from the original record- 
ing; the recorder was then switched to play- 
back without disturbing its volume setting, 
and the last two words were carefully deleted 


with a permanent magnet. The next sen- 
tence was recorded, treated in the same way, 
and so on. When the last sentence was done, 
the entire process was repeated, except that the 
first two words, rather than the last two, were 
erased. By inserting carrier phrases, “Num- 
ber one,” etc., a typical word-intelligibility test 
was produced. The test, thus made, repro- 
duced the words in the sentence test in their 
exact relative intelligibilities, yet presented 
them separately, out of meaningful context. 
The recording was given a noise treatment 
similar to that used on the sentence test, the 


it was possible to match 10 generally liked 
and 10 generally disliked messages for word- 
intelligibility of their principal components. 
No pair showed a discrepancy greater than 6 
per cent in either first or last word position. 
The average difference for first words was 


1 For first words in generally liked messages, mean 
intelligibility was 67.9 per cent, S.D. 31.8; for last 
words, mean was 59.8, S.D. 29.4. Corresponding values
for generally disliked messages were: mean for first 
words, 61.3, S.D. 32.5; mean for last words, 43.3, 
S.D. 33.8. 

2 Those interested in further analyses of scores may 
obtain lists of subjects in Groups I and II, with their 
scores, from American Documentation Institute, 1719 N 
Street, N.W., Washington 6, D. C., by ordering Docu- 
ment 3033 and remitting $1.00 for microfilm (images 
1 inch high on standard 35 mm. motion picture film) 
or $1.0/ for photocopies (6x 8 inches) readable without 
optical aid. 
1.6, for last words, 1.2. Both favored generally
disliked messages. Though middle
words were not rated for intelligibility, sen- 
tence test responses showed that middle 
words were missed slightly more frequently 
in liked messages. Thus the paired disliked 
sentences were slightly favored in word in- 
telligibility. The 10 pairs of sentences are 
shown in Table 2. 
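The matching step described above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how pairs were selected, so the greedy closest-first rule, the function name `match_pairs`, and all message labels and intelligibility values below are hypothetical. Only the 6 per cent tolerance on first-word and last-word intelligibility comes from the text.

```python
from itertools import product


def match_pairs(liked, disliked, tolerance=6.0):
    """Pair liked with disliked messages of similar word-intelligibility.

    Each argument maps a message label to a tuple
    (first_word_percent, last_word_percent). A pair is admissible only if
    the discrepancy is at most `tolerance` in BOTH word positions.
    Admissible pairs are taken closest-first (a greedy rule assumed here,
    not taken from the source), and each message is used at most once.
    """
    candidates = []
    for (l_name, l_scores), (d_name, d_scores) in product(liked.items(), disliked.items()):
        gap_first = abs(l_scores[0] - d_scores[0])
        gap_last = abs(l_scores[1] - d_scores[1])
        if gap_first <= tolerance and gap_last <= tolerance:
            candidates.append((gap_first + gap_last, l_name, d_name))
    candidates.sort()

    used_liked, used_disliked, pairs = set(), set(), []
    for _, l_name, d_name in candidates:
        if l_name not in used_liked and d_name not in used_disliked:
            pairs.append((l_name, d_name))
            used_liked.add(l_name)
            used_disliked.add(d_name)
    return pairs


# Hypothetical intelligibility scores (per cent of listeners correct).
liked = {"liked_A": (72.0, 60.0), "liked_B": (40.0, 52.0), "liked_C": (88.0, 20.0)}
disliked = {"disliked_A": (70.0, 64.0), "disliked_B": (44.0, 48.0), "disliked_C": (30.0, 90.0)}

pairs = match_pairs(liked, disliked)
print(pairs)
```

With these invented scores, `liked_C` stays unmatched because no disliked message falls within 6 points of it in both positions, which mirrors why only 10 of the generally liked sentences could be paired.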

In scoring accuracy of transcription of 
messages, a message was counted correct only 
if all three words were correctly transcribed. 
Attitude responses of L (I like very much 
fully represented. It is apparent, however,
that the messages merely liked by a majority 
are more favored. This circumstance led to 
further analysis, which is presented in Table 2.

In Table 2, the 10 pairs of sentences, 
equated for word-intelligibility, are examined 
for each pair’s contribution to the mean 
difference found. This analysis concerns 
Group II only, since Group I was too small 
to give significance to such a breakdown. It 
may be observed that five of the pairs give 
differences in favor of the liked sentences; 


TABLE 2 


PAIRS OF SENTENCES EQUATED FOR WORD DIFFICULTY, WITH FREQUENCIES OF CORRECT TRANSCRIPTION
IN GROUP II, N=26


LIKED MESSAGE


DISLIKED MESSAGE    DIFFERENCE





Swing in hammock. 
Win at poker. 
Swing your partner. 
Dress for dancing. 
Throw a party. 
Pass the turkey.
Earn a fortune.
Play with baby.
Find ten dollars. 
See your buddies. 


Shake out ashes. 

Pay your taxes. 
Choke from coughing. 
Clean the stable. 
Have a headache. 
Scrape on blackboard. 
Get wrong answers. 
Burn your finger. 

Go to dentist. 

Do hard labor. 
17
to do this) and l (I like to do this) were
lumped as likes; those of d and D as dislikes.
Table 1 shows the principal relations between 
attitude and sentence test scores. Where 
means are indicated as significantly different, 
the test used was Student’s t, applied to distributions
of difference scores.
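A minimal sketch of the scoring and the significance test just described, assuming one record per subject and message. The all-three-words rule, the lumping of L and l as likes and of d and D as dislikes, and the use of Student's t on difference scores come from the text; the function names, the record layout, and the numbers in the example are hypothetical.

```python
import math
from statistics import mean, stdev

LIKES = {"L", "l"}      # "like very much" and "like", lumped as likes
DISLIKES = {"d", "D"}   # both degrees of dislike, lumped as dislikes


def message_correct(transcribed, sent):
    """A message counts as correct only if all three words match."""
    return transcribed.lower().split() == sent.lower().split()


def percent_correct(records, ratings):
    """Per cent of messages rated in `ratings` that were transcribed correctly.

    `records` is a list of (rating, is_correct) tuples for one subject.
    Returns None when the subject used none of these ratings.
    """
    marks = [ok for rating, ok in records if rating in ratings]
    return 100.0 * sum(marks) / len(marks) if marks else None


def t_of_differences(differences):
    """Student's t for the hypothesis that the mean difference is zero.

    `differences` holds one like-minus-dislike score per subject.
    Returns (t, degrees_of_freedom).
    """
    n = len(differences)
    standard_error = stdev(differences) / math.sqrt(n)
    return mean(differences) / standard_error, n - 1


# One hypothetical subject: rating given to each message, and whether
# the transcription of that message was fully correct.
subject = [("L", True), ("l", False), ("I", True), ("d", False), ("D", False)]
like_score = percent_correct(subject, LIKES)
dislike_score = percent_correct(subject, DISLIKES)

# Hypothetical like-minus-dislike differences for five subjects.
t_value, df = t_of_differences([10.0, 20.0, 30.0, 0.0, 15.0])
print(like_score, dislike_score, t_value, df)
```

The resulting t would then be referred to a table of Student's distribution with the stated degrees of freedom.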

When each S’s score on likes, indifferences, 
and dislikes is expressed as a percentage of all 
statements he marked in this manner, mean 
listening performance is correlated with de- 
gree of liking. Generally liked and disliked 
messages show the same trend, but to a 
greater degree. Even after sentences are 
equated for word-intelligibility, reliable dif- 
ferences favoring liked messages persist. The 
differences characteristic of successive steps 
in analysis of scores may be seen by reading 
the last column of Table 1. If value attach- 
ing to the messages were the principal causal 
factor in the enhanced intelligibility of liked 
messages, the differences associated with in- 
dividual reactions should be greatest, for in 
this comparison, each S’s likes and dislikes are 


the other pairs favor the disliked, but not 
sufficiently to destroy the significant mean dif- 
ference favoring liked messages. Inspection 
of this table suggests to the writers that de- 
gree of familiarity with the messages, not lik- 
ing or disliking, may be the causal factor in 
the mean difference favoring generally liked 
messages.³ If the Ss had rated the statements
for frequency with which they are matters of 
concern, rather than according to liking, it is 
possible that much greater mean differences 
would have been obtained. Since what one 
likes and what is familiar to him are likely 
to be positively correlated, it is suggested that 
future experiments examining possible relationships
between values and perception
should control familiarity with stimuli rep- 
resenting various value conditions. Results 

3 Our interpretation resembles to a considerable degree
that of Howes and Solomon (2), which has appeared 
since this study was done. They have found that “the 
duration at which verbal discrimination appears is of 
the same order for taboo and for neutral words, when 
the effects of Thorndike-Lorge frequencies are extracted” 


(p. 233). “Frequency” for words and “familiarity” of 
messages would appear to have much in common.
of all the studies examined by the writers
which purport to show relationships between 
deeply rooted value systems and perception 
could be explained on the basis of familiarity, 
rather than value, if the investigator were so 
disposed. 

Further research comparing the relative 
potency of familiarity and values as influences 
upon message intelligibility is now in prog- 
ress. Other studies might well explore pos- 
sible effects of task-orientation as opposed to 
ego-orientation in experiments which require 
subjects to interpret ambiguous stimuli. 


SUMMARY 


In summary of the present experiment,
liked voice messages were better transcribed
by college student Ss than were disliked messages.
Consistent results were obtained in
two groups of Ss. Analysis of the Ss’ scores
suggested that valuing behavior affected accuracy
of transcription. Further analysis, examining
messages item by item, suggested
that familiarity with message ideas is a more 
plausible explanation for the results than is 
valuing behavior. 
CASE REPORT


MEASUREMENT OF SOME EFFECTS OF ELECTROCONVULSIVE
THERAPY ON THE INDIVIDUAL PATIENT¹


BY GERALD R. PASCAL² AND JEAN B. ZEAMAN


Butler Hospital, Providence, R. I.


A GREAT deal of literature has accumulated
with respect to the effects of electro-
convulsive therapy (2, 8, 19, 41). Out
of numerous studies and clinical observations
have grown conflicting theories regarding the
causes of therapeutic success or failure, sug-
gestions as to the number and spacing of
treatments, and indications for prognosis (9,
10, 12, 14, 20, 23, 29, 36, 37, 51). Certain
psychoanalytically oriented writers tend to
attribute the success of electroconvulsive
therapy to the psychological meaning of the
treatment or to the mobilization of defenses
occasioned by the threat of destruction in-
herent in the treatment (47, 37, 6). Some
writers tend to favor the view that improve-
ment is related to a decrease in inhibitory
function of the cortex (10) or an increase in
adrenal sympathetic function (12). Other
writers point to the decrement of the more
recently learned of two habits, by ECT (55,
52, 36). The number and spacing of treat-
ments is a matter of some dispute. On the
one hand, there are workers who contend
that a sufficient number of treatments should
be given so that the patient becomes con-
fused between treatments (23, 24), but, on
the other hand, some writers hold to the
opinion that treatment should be spaced so
as to prevent intellectual impairment (14,
51). The first hold that confusion at the
beginning of treatment is a sign for good
prognosis, while the second group hold that
initial confusion is a sign for a poor
prognosis.

In order to subscribe to one or more of
these theories it is necessary to ascertain
objectively, insofar as possible, the course of
the patient’s progress during and after treatment.
Systematic recordings which estimate
cortical functioning need to be taken as
routinely as blood count, temperature, or
pulse rate. Such measures should be easily
and rapidly administered and diagnostic of
the patient’s progress.


1 The authors are indebted to Dr. Stanley Michael
who, on the basis of preliminary work, suggested some
of the measures used in this study.

2 Now at the Western Psychiatric Institute and Clinic,
University of Pittsburgh.

Most studies have measured the reaction 
of the organism at one or more levels of 
functioning, averaging the results for a num- 
ber of cases. The present study reports ex- 
tensive findings on an individual case, using 
measures which we suppose are directly 
related to the effects of ECT and which fol- 
low the course of the patient’s reaction to it. 
These measures are then intercorrelated, and 
four measures which seem to be the most 
efficient for following the patient's reaction 
to treatment are extracted and presented. 
These four measures can be administered by 
nurses on the ward in less than 20 minutes. 
The use of the measures in clinical practice 
is illustrated on several cases. 


METHODS AND PROCEDURE


A battery of tests was assembled which, it 
was hoped, would get at various levels of 
functioning. For the study of the individual 
case it was necessary that tests be used, the 
results of which could be compared to the 
findings of other investigations. These tests 
and the methods of scoring them are de- 
scribed in Table 1. For simplicity of refer- 
ence, code numbers are assigned to test 
batteries. Table 1 shows these numbers. 

On shock days the procedure was as 
follows: 

30 minutes before ECT, P₁
5 minutes before ECT, B
20 minutes after ECT, B
30 minutes after ECT, P₁
110 minutes after ECT, B
120 minutes after ECT, P₁
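The shock-day timetable above can be written out as clock times once the hour of treatment is fixed. The sketch below is illustrative only: the offsets are those listed, but the date and the 9:00 treatment hour are hypothetical, and reading the subscripted procedure label in the list as Procedure 1 is an assumption based on the tests that appear in the shock-day results.

```python
from datetime import datetime, timedelta

# (minutes relative to ECT, procedure) as listed for shock days.
# "P1" assumes the subscripted label in the list denotes Procedure 1.
SHOCK_DAY_OFFSETS = [
    (-30, "P1"),
    (-5, "B"),
    (20, "B"),
    (30, "P1"),
    (110, "B"),
    (120, "P1"),
]


def shock_day_schedule(ect_time):
    """Return (clock time, procedure) pairs for one shock day."""
    return [(ect_time + timedelta(minutes=minutes), procedure)
            for minutes, procedure in SHOCK_DAY_OFFSETS]


# Hypothetical treatment time; the source does not give the hour of ECT.
schedule = shock_day_schedule(datetime(2000, 1, 3, 9, 0))
for when, procedure in schedule:
    print(when.strftime("%H:%M"), procedure)
```

Holding the offsets in one table keeps the testing times fixed relative to treatment on every shock day.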

On non-shock days the patient was administered
P₂ and B procedures 24 hours after
TABLE 1

DESCRIPTION OF TESTS

TEST: METHOD


Procedure B* (B) (25)†

White Blood Count: Both chambers of the counting chamber filled, both sides counted, and results
averaged. Blood was extracted from the finger tips.
Score—average count.

Lymphocyte Count: Five hundred cells counted in the slide method of doing a differential, and the
results averaged.
Score—per cent lymphocytes.

Coagulation Time: Method used is that described by Best and Taylor (4) in which blood is drawn into
a capillary tube. The tube was kept in a constant temperature bath and broken
every five seconds after two minutes had elapsed. A thread fiber about one inch
long was the criterion for coagulation.
Score—time in sec.


Procedure 1 (P₁)

Reaction Time (13, 50):
Auditory. Simple reaction to a buzzer.
Visual. Simple reaction to a light.
Choice. Left hand to right light, and right hand to left light.
Score—Means, in sec., of 20 trials.

Ergograph (13): Pulling with right index finger for three minutes against spring with 15 pounds
tension.
Score—length of the baseline in cm.

Orientation Test: Year XIV, 5, Stanford Binet, Form L (Revised) (45). Comparable variations.
Score—number correct.


Procedure 2 (P₂)

All of P₁

Noun Naming (11): Number of nouns named in two minutes with eyes closed.
Score—number of nouns.

Serial Subtraction (48): Subtracting 7 from 102, 103, 104, 101, 100, etc., varying starting number with
each test.
Score—time in sec.

Color Naming (48): Naming colors given in Wells-Ruesch Handbook (48).
Score—time in sec.

Thematic Apperception Test, Blank Card (28): Subject asked to make up a story to card 1 and then blank card.
Score—number of words in response to blank card.

Digit Span (48): Digits forward and reversed.
Score—total number of digits forward and reversed.

Continuous Addition (48): Test administered for 10 minutes.
Score—average time in sec. to do 10 additions.

Bender-Gestalt Test (3, 18, 32, 33): Copying nine gestalt figures. Method of administration and scoring is that
described by Pascal (33).
Score—quantitative score in terms of deviations from stimuli, the higher the
score the greater the deviation.


Procedure 3 (P₃)

Wechsler-Bellevue (46)

Rorschach (1, 21)


Electroconvulsive Therapy (ECT): 120 volts AC, .3 sec., grand mal seizures.


* The authors are indebted to Miss Felicia Craddock, Chief Medical Laboratory Technician, who was responsible for all measures
of Procedure B.

† Numbers in parentheses indicate main references to test procedures.


shock was given. Conditions with respect to
rest and food intake were approximated as
nearly as possible to those on shock days.
Ordinarily, shock was given on Monday,
Wednesday, and Friday. Non-shock-day
testing was accomplished on Tuesday, Thursday,
and Saturday. The patient was administered
P₂ and B schedules for five days
before shock and nine days after the completion
of shock treatment. The diurnal
factor was controlled in before and after test- 
ing as it was in testing during shock treat- 
ment. Table 2 shows the schedule of test 
administration. 


TABLE 2


SCHEDULE OF TEST ADMINISTRATION

[Table body not recoverable: a grid of administration marks (x) by date under the PROCEDURE columns, with a final dated row, 1949: January 15, and a Totals row.]
The Patient


The patient, a 28-year-old woman with 
some college education, is the mother of three 
children. She was admitted to the hospital 
after attempted suicide. She had, previous to 
this attempt, been morose, tearful, insomniac, 
distraught, and had expressed ideas of being 
persecuted. This behavior continued until
ECT was begun. 

The patient had been living with her hus- 
band at the home of her parents until shortly 
before the onset of the present illness. Dur- 
ing her married life, she had allowed her 
mother to assume most of the responsibilities 
for the up-bringing of the children. The 
patient’s symptoms became noticeable after 
she and her husband and their family moved 
from the parental home to a home of their 
own. Early history indicates that she was 
regarded as a “nervous child.” Her mother 
is said to have been excessively dominant. 
Patient is also said to have been dominated 
by a younger sister. Patient was described 
as being sociable, hyperactive, somewhat 
irritable and sensitive prior to illness. 

Patient was diagnosed schizophrenic, cata- 
tonic type, at medical staff conference. She 
had continued unimproved clinically through 
two months of hospitalization before ECT 
was begun. A total of 10 ECT’s was given. 
The patient showed marked clinical improve- 
ment after the fourth shock. After the sixth 
shock she became euphoric, hyperactive, and 
somewhat confused. This behavior con- 
tinued until sometime after conclusion of 
treatment, when the patient is said to have 
“settled down.” She was discharged one 
month after termination of treatment, 
“improved.” 

She has been seen as an out-patient at semi- 
weekly intervals for seven months. She is 
said to be making a fairly good adjustment 
to a difficult situation engendered by marital 
difficulties. 

The following notes were taken by the 
examiner (E) on shock days, twenty minutes 
after shock: 


ECT 1—May 24—Patient crying, confused, wanted
to know the date, how got into
this room. Remembered tests.
Cooperative at 30 minutes post
shock.

ECT 2—May 26—Confused, didn’t remember what
E wanted. Asked when she
could go home. At 30 minutes
post shock recalled test when
presented—cooperative.

ECT 3—May 28—Discovered doing a jigsaw
puzzle. Remembered E. At 30
minutes post shock was cheerful
and cooperative.

ECT 4—June 2—Seemed in good contact, cheerful,
glad to see E.

ECT 5—June 4—Slightly euphoric and hyperactive.
Cooperative on tests.

ECT 6—June 7—Patient very gay, restless, distractible.

ECT 7—June 9—Extremely active, gay, distractible.

ECT 8—June 11—Incoherent, doesn’t remember E,
but good humored and easily
persuaded to take tests by 30
minutes post ECT.

ECT 9—June 14—Behavior the same as ECT 8.

ECT 10—June 16—Complaining, hyperactive, confused—essentially
the same as
ECT 8 and 9, cooperative on
tests.
five days before treatment on the W-B test
exceeds, significantly, improvement to be
expected on the basis of practice effect (16).
Note particularly the improvement in Com- 
prehension and Picture Arrangement. If our 
findings can be equated with published test 
indications of prognosis, this patient’s prog- 
nosis was undoubtedly very good, even 
showing marked recovery on the tests fol- 
lowing hospitalization (35, 15, 5). Test re- 
sults seven months after shock suggest 
continued improvement. 


P₁ and B Measures


Table 4 shows the results obtained with 
measures taken on shock days immediately 


TABLE 3 


RESULTS WITH THE RORSCHACH AND WECHSLER-BELLEVUE TESTS








60 DAYS BEFORE ECT

5 DAYS BEFORE ECT

30 DAYS AFTER ECT

7 MONTHS AFTER ECT





Rorschach 


No. R's 
W, D, Dd 
F+% 
M:C 

C, CF, FC 
P 


26 
9,16,1 
93 
5:35 
0,3,1 


Wechsler-Bellevue 


Full Scale IQ 110
Verbal IQ 106 
Performance IQ 112 
Comprehension (wtd. score) 11 
Pict. Arrangement (wtd. score) 9 


129 
118 125 133 
137 132 141 
15 15 16 
18 13 17 


141 





RESULTS

Rorschach and Wechsler-Bellevue 

Table 3 shows the results obtained with 
the Rorschach and Wechsler-Bellevue tests. 
These tests were administered 60 days before, 
five days before, and 30 days and seven 
months respectively, after the conclusion of 
treatments. Although there is undoubtedly 
some improvement after the conclusion of 
treatments, both tests are in agreement in 
showing that the patient’s major improvement
occurred before beginning shock treatment,
although this fact was not clinically
apparent. In fact, the decision to try ECT 
was made on the basis of the fact that no 
clinical improvement was apparent after two 
months of hospitalization. The improve- 
ment noted between 60 days before and 


before, one-half hour post, and two hours 
post shock. All the measures except the 
Orientation Test show a tendency to be 
affected when measured one-half hour post 
shock. Four measures (choice reaction time, 
ergograph, clotting time, and lymphocyte 
count) show a difference between pre-shock 
and one-half hour post-shock measures, sig- 
nificant at the 1 per cent level of confidence. 

With reaction time, the simple measures, 
both visual and auditory, seem to be less 
affected by ECT than the more difficult 
choice reaction time, which is to be expected. 
Other investigators have obtained similar 
results, likening such performance to results 
obtained from individuals suffering from 
brain damage (51, 13, 42, 43). 

The results obtained with the ergograph 
TABLE 4


P₁ AND B MEASURES ON SHOCK DAYS


MEAN VALUES: Pre Shock; ½ Hr. Post; 2 Hr. Post
DIFFERENCES, P-VALUES: ½ Hr. Post Minus Pre; 2 Hr. Post Minus Pre; ½ Hr. Post Minus 2 Hr. Post





Visual RT 
SD Vis. RT 
Aud. RT 

SD Aud. RT 
Choice RT 
SD Choice RT 


Ergograph 
WBC 

Clotting Time 
Lymphocyte Ct. 
Orientation Test 
.54
.58
.03
.16


.05
.12
.03
are explainable on similar grounds, the in-
creased length of baseline suggesting a per- 
severative factor often found in brain damage 
cases. Figure 1 shows, however, that most 
of the increase in the length of the baseline 
is attributable to the results obtained after 
the first three shocks. The measure taken 


[Figure 1. Length of baseline on ergograph on successive shock days. Curves: pre-shock, ½ hr. post shock, 2 hr. post shock. Abscissa: successive shock days.]
one-half hour after the fourth shock shows
a dramatic shift corresponding to clinical 
improvement, the patient becoming some- 
what euphoric and impatient with the 
monotonous task of pulling the ergograph at 
this time. The ergograph measure returns 
to pre-shock level by two hours post shock. 
This graph has been presented as illustrative 
of the results obtained with other P₁
measures.

Clotting time shows a significant drop 
when pre-shock and one-half hour post- 
shock times are compared. This finding sug- 


gests an adrenal-sympathetic reaction as a 
result of shock, a finding indicated by other 
investigators (12). Clotting time tends to 
return to pre-shock levels by two hours post 
shock. 

The lymphocyte reaction indicates a 
lymphocytosis immediately post shock, fol- 
lowed by a lymphopenia two hours post 
shock, a finding confirmed by the results of 
other investigators (26, 27). Whereas nor- 
mals tend to show a lymphopenia imme- 
diately after stress, Hoagland, Pincus, and 
Elmadjian (17, 34) have shown a lympho- 
cytosis for psychotic patients. Our results 
with a single case are in line with expecta- 
tions based on their work. 

The results of P₁ and B measures on shock
days have in each case borne out the expec- 
tations based on the work of previous investi- 
gators, lending credence to the results of our 
study based on a single case. 


P₂ Measures


P₂ measures were taken for six days before
beginning shock treatment and for nine days 
after completion of shock treatment, making 
a total of 15 measures before and after treat- 
ment (see Table 1). In Table 5 these before- 
and after-treatment measures are compared 
with measures taken on non-shock days dur- 
ing the course of treatment. Of the com- 
parisons shown (Table 5) only one, color 
naming, shows a difference significant at the 
1 per cent level of confidence. Most of the 
measures to be discussed show either upward 
or downward trends as treatment progresses 
and for several days after conclusion of
treatment, making comparisons such as 
those shown in Table 5 difficult and not 
representative. 


TABLE 5 


P₂ AND B MEASURES ON NON-SHOCK DAYS: PRE
AND POST ECT MEASURES VERSUS
DURING-ECT MEASURES


MEANS (PRE-POST, DURING); P-VALUES OF DIFFERENCES


TAT Blank 17.
Color Naming 22.
Continuous Addition 11.
Bender-Gestalt
Ergograph—Length of Baseline 15.
White Blood Count 10200
Clotting Time 254
Lymphocyte Count 2586
Serial Subtraction 53.7
Digit Span 14.8
Mean Visual RT 15.5
Mean Auditory RT 15.4
Mean Choice RT 24.1
Nouns Named 65.9





All P₂ measures were intercorrelated.
Table 6 shows those correlation coefficients 
significant at the 5 per cent level of confi- 
dence or better. Color naming shows the 
following correlations, all significant at the 
1 per cent level of confidence. 

White blood count .70
Lymphocyte count .53
Clotting time .60
Continuous addition .74
Visual reaction time .53
Auditory reaction time .73
Choice reaction time .68

The blood measures, white blood count, 
lymphocyte count and clotting time, tend 
to show some continuing effect of ECT when 
measured 24 hours post shock, but color 
naming is correlated with them and shows a 
highly significant rise during the course of 
treatment. The color naming test might 
well, therefore, for our purpose be substituted 
for these more time-consuming tests. 
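The intercorrelation of measures and the test of each coefficient against zero can be sketched as below. The source does not state the number of paired observations or how significance was evaluated, so the function names, the sample data, and the use of the t transformation of r with n − 2 degrees of freedom are assumptions made for illustration.

```python
import math


def pearson_r(x, y):
    """Product-moment correlation between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical daily scores on two measures (not the patient's data).
color_naming = [1.0, 2.0, 3.0, 4.0, 5.0]
white_count = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(color_naming, white_count)
t = t_for_r(r, len(color_naming))
print(round(r, 2), round(t, 2))
```

A coefficient would be called significant at the 1 per cent level when its t exceeds the tabled value for the available degrees of freedom, which is why the same r can be significant in a long series and not in a short one.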

Of the measures of reaction time, only 
choice reaction time shows any noticeable 
tendency to rise during the course of treat- 
ment, but the measure is correlated very 
highly with color naming, which is a simpler, 
more easily administered test. 
Continuous addition scores show a gradual,
cumulative effect of ECT, indicating worsen- 
ing of performance with increasing numbers 
of shock, but both serial subtraction and 
noun naming show the same trend and are 
highly correlated with it. The continuous 
addition test is more complicated and time- 
consuming than either the noun naming or 
serial subtraction test. 

The number of words to a blank TAT 
card and the length of baseline on the ergo- 
graph both show a gradual decline of per- 
formance, due, insofar as can be ascertained, 
to the subject’s boredom with the task. They 
seem to add little to the study and when 
measured 24 hours post shock do not seem to 
be diagnostic of the patient’s adjustment. 

Digit span is a simple test and is fairly 
successful, but it correlates with the same tests 
as does color naming. It does not, therefore, 
seem worth while to retain it. 

We are left then with the following tests: 
noun naming, serial subtraction, color nam- 
ing, and Bender-Gestalt. Although noun 


naming and serial subtraction correlate fairly 
highly with each other, they do not correlate 
with color naming. Each of these is ex- 
tremely simple to administer and takes very 
little time. Each of them shows progressive 


impairment with increasing number of 
shocks, followed by a return to pre-shock 
levels by two weeks after conclusion of treat- 
ments, a finding in agreement with that of 
other investigators for the recovery of intel- 
lectual impairment after shock treatment (43, 
44, 5, 22, 39, 51, 50). Figure 2 shows the
behavior on these two tests before, during, 
and after the course of treatment. Note that 
it was necessary to establish a baseline of 
performance some days prior to the begin- 
ning of treatment in order to minimize prac- 
tice effects and to have a basis of comparison 
for later performance. 
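
The selection reasoning used above (retain a simple test, and drop a test that merely duplicates one already retained) can be sketched in code. What follows is an illustration only, not the authors' computation: the scores are invented, the 0.8 cut-off and the function names `pearson` and `select_tests` are assumptions, and the product-moment coefficient is assumed because the article does not name the coefficient used. The authors retained both noun naming and serial subtraction despite their fairly high intercorrelation, so the rule below is stricter than their actual practice.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_tests(scores, preference, threshold=0.8):
    """Walk the tests in order of preference (simplest first) and keep a
    test only if its |r| with every test already kept is below threshold."""
    kept = []
    for name in preference:
        if all(abs(pearson(scores[name], scores[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical scores over six testing sessions (invented for illustration).
scores = {
    "color_naming": [30, 34, 28, 33, 36, 29],
    "digit_span": [31, 35, 29, 34, 37, 30],  # duplicates color naming
    "noun_naming": [20, 18, 15, 13, 12, 19],
}
selected = select_tests(scores, ["color_naming", "digit_span", "noun_naming"])
print(selected)
```

With these invented figures the redundant test is dropped and the two remaining tests are kept, which mirrors the argument for discarding digit span in favor of color naming.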

Figure 3 shows the results with the color 
naming test. Note that after the onset of 
treatment there is an initial rise, or worsen- 
ing of score, followed by a drop after the 
fourth shock, corresponding to clinically 
noticed improvement, a finding corroborated 
by other investigators (30). As treatments 
proceed, however, there is a gradual worsen- 
ing of performance, and performance does 
not reach pre-shock levels until two weeks 
post shock, which again corroborates the 
findings of other workers. 

The Bender-Gestalt (B-G) Test does not 
correlate significantly with any other test at 
the P=.01 level. A good deal of recent work 
has indicated that scored deviations from the 
stimulus forms on this test differentiate very 
reliably between the records of psychiatric 
patients and non-patients (32, 33). Increased 
score on the test is correlated with increased 
severity of psychological disturbance. Several 
studies have indicated, on a qualitative basis, 
impaired performance on this test as a result 
of ECT (31, 42). Whatever the functions 
measured by the test, our results indicate the 
scores follow very well the course of clinical 
improvement. Figure 4 shows our patient's 
performance on this test. Note the drop in 
score (better performance) following the 
fourth shock, a low point not reached again 
until over one month after conclusion of 
treatments. It will be remembered that 
after conclusion of treatments the patient 
became hyperactive and euphoric and that 
this behavior continued until four weeks 
after conclusion of treatments, at which 
time she is said to have “settled down.” The 
B-G scores represent the patient’s behavior 
fairly well during this period, finally show- 
ing a drop in score coincident with the 
patient’s “settling down” and the granting 
of full privileges prior to discharge. 

Each of the four tests selected on the basis 
of performance and intercorrelations with 
other tests has followed the course of our 
patient’s progress during and after shock. 
Although performance on these tests is cor- 
roborated by the findings of other investi- 
gators with similar tests and, therefore, 
suggestive of the reliability of our findings, 
there remains the possibility that the coinci- 
dence of behavior and test performance is 
unique for this patient. We, therefore, ad- 
ministered the B-G, color naming, and noun 
naming tests to two more patients. (Serial 
subtraction was unfortunately omitted.) 
Both patients were females: Case A, 58, mar- 
ried, and Case B, 45, single—both diagnosed 
manic depressive, depressed. Both patients 
experienced a favorable reaction to a course 
of 12 electroconvulsive treatments. Both 
were discharged “improved” one month after 
conclusion of treatment. 

Figures 5, 6, and 7 show results obtained 
with Case A. Tests were administered 24 
hours post shock. The patient experienced 
initial clinical improvement during the 
course of ECT, paralleled by low scores on 
B-G and color naming tests. This improve- 
ment was followed by a certain amount of 
euphoria and mental confusion lasting sev- 
eral days after the conclusion of ECT, re- 
flected in the B-G and color naming scores. 
Noun naming shows a progressive impair- 
ment through the course of ECT. Noun 
naming and color naming show a return to 
pre-shock levels by 10 days after completion 
of treatment, at which time the patient was 
declared clinically improved and granted full 
privileges. Note that the B-G scores show 
the last five measures to be below pre-shock 
scores, the downward trend beginning four 
days after completion of treatment. Essen- 
tially the same results were obtained with 
Case B. 

Two male patients were now matched for 
age, education, and occupation. They were 
under the care of the same physician. One 
patient, Case C, diagnosed as an involutional 
psychotic with paranoid trends was given 
electroconvulsive therapy; the other, Case D, 
the control, diagnosed manic depressive, 
manic type, was not. Both of these patients 
improved after approximately nine weeks of 
hospitalization. Both were given the four 
tests at the same time (24 hours post shock 
on shock days) on the ward by nurses, who 
also kept the graphs shown in Figures 8, 9, 
10, and 11.³ 

The four cases presented here show initial 
improvement as a result of ECT, followed by 
a worsening of scores as treatment progresses. 
Scores return to pre-shock levels or better 
several days after conclusion of treatment. 
If it is important to obtain mental impairment 
during the course of treatment, as Lowenbach 
(23), Kalinowski (19), and others believe, then 
in each of our cases this impairment is 
objectively demonstrated. If it is necessary to 
“knock out” recently learned habits, as some 
investigators suggest (36), then our results 
objectively demonstrate that recently learned 
habits are definitely impaired during the course 
of treatment. On the other hand, if one 
subscribes to the thesis that ECT should proceed 
at such a rate as to prevent mental confusion, 
as is maintained by Ziskind (51), then our 
method indicates the possibility of providing 
objective data upon which to space single shocks 
so that progressive impairment does not take 
place. 

3 The authors wish to express their gratitude to Miss 
McGibbons, Superintendent of Nurses, and to the nursing 
staff of the hospital for their cooperation throughout the 
experiment. 

All four of our cases show initial improve- 
ment and later impairment of mental func- 
tions, during the course of treatment, which 
correspond with a final favorable reaction to 
ECT. This finding confirms the opinion 
advanced by Lowenbach (23). 

Record keeping in the manner suggested 
provides objective evidence as to the onset 
and cessation of mental impairment during 
and after treatment. It is suggested that such 
records can become as important and mean- 
ingful to the psychiatrist as records of pulse 
rate and temperature and that they will pro- 
vide objective evidence upon which to base 
clinical judgment. 


SUMMARY AND CONCLUSIONS 


A battery of tests tapping various levels of 
functioning was administered to a psychotic 
patient before, during, and after the course 
of electroconvulsive treatments. Test find- 
ings were related to the findings of other 
investigators. The tests administered were 
then intercorrelated, and four tests were 
found which followed the course of the 
patient’s progress as validated against clinical 
judgment. These four tests were the Bender- 
Gestalt, color naming, noun naming, and 
serial subtraction tests, all simply administered 
and quickly scored. Results with three other 
patients undergoing treatment were presented, 
indicating the validity of the tests in following 
the course of the patient’s progress. It is 
suggested that the tests be used regularly on 
patients undergoing ECT to provide objective 
evidence on which to base clinical judgment. 


CRITIQUE AND NOTES 


“A RIGOROUS CRITERION OF FEEBLEMINDEDNESS”: A CRITIQUE* 


BY ROBERT H. CASSEL 
Research Psychologist, Training School, Vineland, New Jersey 


ALTHOUGH critiques of scientific publications have 
long been thought to be of value, critical 
reviews of articles on the concept of feebleminded- 
ness are seldom found in the more recent literature. 
Recently, this journal published an important 
contribution by Jastak (5) entitled, “A Rigorous 
Criterion of Feeblemindedness.” It is believed, 
however, that the full significance of this article 
might not have been immediately apparent and 
that a critique would be of value. 

Jastak emphasizes the viewpoint that a diagnosis 
of feeblemindedness is warranted only when the 
subject’s psychometric scores, representing a large 
number of different abilities, are in every case 
below the second or third percentile for persons 
of the subject’s sex and age. He holds that the 
diagnosis should be entirely dependent on these 
psychometric scores. The purposes of this critique 
are: (a) to point out that other characteristics of 
the feebleminded were clearly implied in Jastak’s 
article, (b) to show the relationship of these other 
criteria to the psychometric criterion, and (c) to dis- 
cuss the significance of the psychometric criterion 
for a more inclusive concept of feeblemindedness. 

At the outset Jastak expresses alarm at the many 
follow-up studies on the paroled feebleminded 
which report these individuals to be living appar- 
ently normal community lives and earning a liveli- 
hood. Clearly, the implication here is that the 
feebleminded cannot do these things. Later in the 
article Jastak states (p. 376), “The truly feeble- 
minded, as this writer has known them, consistently 
fail to manage their affairs with ordinary prudence. 
They fail to attain average economic competence 
by a wide margin.” Obviously Jastak believes that 
the feebleminded are socially incompetent. Since 
he states that they never achieve social competence, 
he also believes that feeblemindedness is essentially 
incurable. 

In summary, Jastak states that the feebleminded 
are mentally subnormal and clearly implies that they 
are socially incompetent and that this condition is 
essentially incurable. Of course, there is no doubt 
that this condition is also developmental, i.e., it is 
amentia and not dementia. A synthesis of Jastak’s 
concept of feeblemindedness as expressed in this 
article plus that which is implied would appear to 
be as follows: social incompetence, due to mental 
subnormality which is developmental and which is 
essentially incurable. 

1 The writer is indebted to Mr. E. L. French, Chief 
Psychologist, Training School, for critical evaluation of 
this manuscript. 

This viewpoint is neither new nor startling. It 
is the traditional concept which dates back at least 
to the British Royal Commission of 1900, and its 
most prominent current exponent is probably 
Doll (2, 3, 4), who refers to it as a symptom-com- 
plex definition. The difficulty with the clinical 
application of this concept is the problem of what 
constitutes mental subnormality, and on this point 
Doll has never provided a completely satisfactory 
explanation. Jastak’s contribution is the suggestion 
of a method to determine, objectively, mental sub- 
normality. The “rigorous criterion” is really a 
rigorous criterion of mental subnormality and not 
of feeblemindedness, as stated in the title. 

Jastak introduces the concept (p. 372) that the 
ultimate level of a person’s latent intellectual power 
is represented by the highest single mental ability, 
psychometrically determined. The difference be- 
tween ultimate capacity and functional level is due 
to such things as abnormal character traits, per- 
sonality disturbances, and environmental depriva- 
tions. It is contended (p. 374) that it may be 
possible, through therapy for instance, to bring 
the functional level up to the level of latent 
intellectual power. This concept is essential to 
Jastak’s definition of the mental subnormality of the 
feebleminded. 

The fundament of Jastak’s viewpoint (p. 374) 
is as follows: “A person examined on a large num- 
ber of separate test scales representing as many 
different functions or abilities may be diagnosed 
as feebleminded only if he fails to surpass the 
lowest two or three percentiles on any one of the 
tests in comparison with the norms of a substantial 
random sampling of the population of his sex and 
age. When scores above these limits do occur, the 
presence of feeblemindedness may be excluded from 
serious consideration.” 

This concept is clearly and admittedly (p. 374) 
statistical, and it is also psychometric. It is stated 
that possibly from twenty to fifty mental functions 
will have to be tested to permit adequate diagnosis. 
On all of these tests the subject, to be classified 
as feebleminded, must score below the fourth 
percentile. 

This statistical concept is different from that 
commonly expressed. The traditional statistical 
viewpoint was perhaps first enunciated by Pintner 
and Paterson (6) and later taken up by Terman, 
Wechsler, and others. They suggested that the 
feebleminded fall below a certain percentile on a 
well standardized test of general mental ability, 
whereas Jastak proposed to restrict the term feeble- 
minded to those who score below the fourth per- 
centile in each individual mental function. The 
issue here is whether a person can ultimately func- 
tion at the level of his highest measured ability, as 
Jastak believes, or whether a general subnormal 
functional level is, in almost all cases, permanent. 
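
The contrast between the two statistical criteria can be made concrete with a short sketch. It is offered only as an illustration of the logic summarized above: the function names, the ability labels, and the percentile values are hypothetical, and the cut-off of 4 for the traditional criterion is an assumption, since the text says only "a certain percentile."

```python
def jastak_feebleminded(percentiles, cutoff=4):
    """Jastak's criterion as summarized in the critique: every measured
    ability must fall below the cutoff percentile. A single score at or
    above the cutoff excludes the diagnosis."""
    if not percentiles:
        raise ValueError("at least one ability score is required")
    return all(p < cutoff for p in percentiles.values())


def traditional_feebleminded(general_percentile, cutoff=4):
    """Traditional statistical criterion: a single score on a test of
    general mental ability falls below a cutoff (value assumed here)."""
    return general_percentile < cutoff


# Hypothetical profile: uniformly low except for one ability.
profile = {"vocabulary": 2, "arithmetic": 1, "digit_span": 3, "block_design": 35}

by_jastak = jastak_feebleminded(profile)       # one high score excludes
by_tradition = traditional_feebleminded(2)     # general score of 2 assumed
print(by_jastak, by_tradition)
```

A person with such an uneven profile would be called feebleminded under the traditional rule but not under Jastak's, which is the group Cassel goes on to ask about.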

The question may also be raised as to who, in 
terms of present classifications of the mentally re- 
tarded, would be feebleminded under Jastak’s cri- 
terion. What of the idiot savant? What of the 
so-called brain-injured exogenous child whose 
unevenness in abilities has been demonstrated by 
Werner, Strauss, and others? This “rigorous cri- 
terion” would appear to leave a substantial number 
of individuals who would be neither normal nor 
feebleminded. Since these individuals would have 
to be classified by some other term, Jastak’s criterion 
might be, in actual practice, but a semantic change. 

One of the things with which Jastak is concerned 
(p. 370) is the detection of “pseudofeeblemindedness.” 
The rigorous criterion of mental subnormality will 
obviously be of assistance in those many instances 
where the pseudofeeblemindedness is due to a mistake 
on the part of the diagnostician (1), for the 
not-feebleminded individual will certainly score 
above the third percentile on some test of mental 
ability. However, whether 
the rigorous criterion will aid in the early identifi- 
cation of delayed development cases remains to be 
seen. It is conceivable that the early apparent 
retardation in these cases is general and applies to 
the range of mental abilities, in which case it might 
not be detected. 

A further valuable contribution of this article is 
Jastak’s plea for and insistence upon improving 
methods of individual measurement. Great em- 
phasis is placed upon the often overlooked relation- 
ship among personality, intellect, and achievement. 
The list of ten ways in which test scales must be im- 
proved (p. 370) is, in general, excellent. Although 
consideration of this list shows that much work 
must be done before Jastak’s criterion of mental 
subnormality will be as thoroughly developed as he 
would like, his concept would appear to be strength- 
ened instead of weakened by these considerations. 

There are two additional points which should be 
dealt with, although neither is vital to the basis of 
the article. One is Jastak’s statement that one of 
the two main criteria commonly used to diagnose 
feeblemindedness is the social criterion. This cri- 
terion is defined as (p. 367), “. . . chronic and 
permanent inability of a person to adjust to what 
society considers a minimum level of independent 
existence.” It is only fair to ask where this cri- 
terion is in use. Jastak suggests Doll (3) as one 
of its exponents. Careful perusal of this reference 
shows that Doll, as mentioned previously, does not 
adhere to this criterion alone but to several, among 
which social incompetence is but one. In disposing 
of this criterion so deftly, Jastak may be disposing 
of something which does not exist. 

The second point is an issue which Jastak raised 
but did not dwell upon. On just what bases are 
children diagnosed as feebleminded? There is 
really no assurance that the theoretical concepts 
printed in the literature actually are used. From 
the standpoint of research and clinical application 
it might be quite profitable to find out just what 
criteria are used in the practical situations. 


ON ROBERT H. CASSEL’S CRITIQUE OF “A RIGOROUS CRITERION 
OF FEEBLEMINDEDNESS” 


BY JOSEPH JASTAK 
Delaware State Hospital 


CASSEL’S feeling that the vital questions raised in 
my article on feeblemindedness require fur- 
ther theoretical evaluation is a legitimate one. Such 
discussions will, however, be more fruitful after 
the proposed criterion has been experimentally 
checked and rechecked against developmental and 
therapeutic facts. At present the danger exists that 
the differences in viewpoint reflect merely the 
apperceptive backgrounds of the authors and not 
the factual materials at hand. 

Some of Cassel’s critical points deal chiefly with 
views inferentially read into the article. My cri- 
terion is not just a criterion of “mental subnor- 
mality,” as Cassel wishes it to be. It is a criterion 
of one specific type of subnormality differentiated 
from a number of other specific subnormalities. 
Mental subnormality is by no means a homoge- 
neous concept. It can be objectively demonstrated 
that there are as many mental subnormalities in 
human beings as there are personality traits, and 
each trait is capable of simulating the symptoms 
and mechanisms of every other trait. For example, 
an early schizoid individual often behaves as if he 
were feebleminded. Yet the schizoid type of sub- 
normality may be distinguished from the intellectual 
type of subnormality. The practical methods of 
measuring other than intellective subnormalities 
have been omitted from my article. This omission 
is probably responsible for the misidentification 
of the vague notion of “mental subnormality” 
with my very concrete and specific concept of 
feeblemindedness. 

The impression of incurability gained from my 
article is also but an inference. Potentially, no 
mental defect is incurable. However, at this stage 
of medical and psychological science, we are unable 
to “cure” not only feeblemindedness but also a 
number of other mental deviations usually associated 
with high latent capacity. If neurophysiologists 
learn to grow gray matter and to graft it on 
to a defective nervous system or succeed in improv- 
ing its blood supply, a cure of feeblemindedness 
may actually be achieved. Such cures may be 
highly improbable, but they are not impossible. 
Whether feeblemindedness is curable or not, is not 
one of the issues of my article. It merely attempts 
to differentiate between a variety of relatively 
malignant and qualitatively dissimilar defects, all 
of which are nowadays placed in the same diagnostic 
basket and conveniently but inaccurately labeled 
“mental subnormality.” 

The feebleminded, if conscientiously diagnosed, 
fail to reach average social norms in any under- 
taking whatever. This failure does not, however, 
entitle us to the inference that they are socially 
unadjustable. I expressly state: “On the basis of 
intra-individual comparisons, low intellect in a 
normal personality may be remarkably success- 
ful...” within the level of its capacity. In fact, 
the realistic and volitionally strong moron (not 
pseudo-moron) may surpass in social productive- 
ness the unrealistic and motivationally weak genius. 
Therapy with the feebleminded is therefore a valid 
and hopeful procedure. Its aim is the development 
of social competency within a signally reduced 
range and level of cerebration. 

Cassel’s fear that a substantial number of persons 
would be neither normal nor feebleminded if our 
criterion were applied is fully justified. Such fear 
should not, however, deter us from facing unpleas- 
ant challenges squarely. Up to this point, such 
stubborn realities have remained hidden behind a 
facile but crude generalization. The confusion of 
etiological factors of failure invariably leads to the 
rationalization of our own diagnostic difficulties 
and to the by-passing of relevant remedial measures. 

Many young children with psychopathic or 
neurotic tendencies are nowadays called feeble- 
minded. Their failures to adjust may be socially 
very destructive. Their “subnormal” acts are, 
however, intellectually so complex that a truly 
feebleminded person would be incapable of com- 
prehending the values and motives involved in 
such acts, or of conceiving the methods of realizing 
them, or of inventing the excuses justifying them. 
Is the scientific separation of the subnormality of a 
psychopathic or neurotic personality from the sub- 
normality of defective intelligence a semantic 
change? Should the social treatment in all these 
cases be identical? 

Our system of psychological interpretation takes 
care not only of the “idiot-savant” and the brain- 
injured person but also of a large number of other 
mental conditions improperly evaluated with 
present methods. The “idiot-savant” is a diagnostic 
artifact based on clinical findings accorded some 
mystical meaning unrelated to the personality 
functioning of the examined individual. Most 
so-called idiot-savants are severely unorganized indi- 
viduals who misuse their fine capacities for some 
simple, iterative act which demands an astonishing 
degree of motivational control. These people pre- 
sent a strictly psychiatric problem. They would 
indeed become “savants” in the fullest sense of the 
term if a cure could be found for their disorganized 
ways of thinking, feeling, willing, and behaving. 

Unevenness in the abilities of spastic and other- 
wise brain-injured children is strong evidence for 
the conclusion that they are not feebleminded but 
merely lack one or more of the specific media of 
expression of their initially good endowments. 
Dr. Phelps, the noted orthopedist, tells me that a 
mental survey of individuals afflicted with cerebral 
palsy in the State of New Jersey showed about 
80 per cent of them to be feebleminded. A similar 
survey made in Delaware showed only 20 per cent 
of them to be feebleminded, and even this high per- 
centage could be materially reduced if our diag- 
nostic tools were more adequate than they are. 
For purposes of rehabilitation and prognosis, it is 
important to distinguish between a disordered 
medium of expression (group factor) and absence 
of intelligence (general factor). 

Finally, Cassel asks where the social criterion is 
used in diagnosing feeblemindedness. The ques- 
tion can be fully answered only by well-planned 
and comprehensive research with the aid of sound 
criteria of analysis. At present, we can only ven- 
ture a guess based on reports in the literature and 
on our daily clinical experiences. That its use is 
widespread and indiscriminate can be gleaned from 
nearly every article printed on the subject of feeble- 
mindedness. It is used even if the statistical cri- 
terion contradicts its accuracy. If it were not used, 
the large number of persons with high or rising 
I.Q.’s should never have been called feebleminded. 

In the Delaware clinics we have “cured” many a 
feebleminded child in short order after the same 
child had been diagnosed elsewhere as feeble- 
minded on no more substantial evidence than the 
reported fact of social failure. We rehabilitate 
these children not by some magic powers but by 
applying rigorous criteria to the report of failure, 
by establishing the relevant genotypes, and by 
recommending psychotherapy appropriate to the 
problems revealed in the course of the examination. 
Some of these children were so obviously intelligent 
that only users of the social criterion or I.Q. testers 
could have mistaken them for feebleminded. 


BY 
University of Chicago 
AND 
DAVID ROSENTHAL 
University of Chicago 


SOUND administration of a mental hospital requires 
an experimental approach to the introduction 
of new therapies, staff in-service training 
programs, and other changes in patient routine. 
The ultimate measure of the value of any hospital 
program is its effects on the patients, reflected not 
only in changes in remission and discharge rates 
but also in the adequacy of patients’ adjustments 
on the wards—to one another, to the hospital staff, 
and to the hospital schedule. The experimental 
selection of a ward (or wards) in which a new 
program is introduced and of comparable wards 
which are kept on existing routine allows for a 
comparison of experimental and control patients 
according to specific evaluative criteria and as 
separate social units. On the basis of such 
comparison, decision about extending the use of a new 
program may be more soundly based. 

It would seem that no matter what change in 
hospital procedure was being studied, its effects on 
life in the wards should be systematically investi- 
gated. Instruments for assaying such changes 
objectively are needed. Hyde and York reported 
on one device (1). Use of their technique provided 
a record on individual patients’ behavior in social 
situations. Since observers must be able to follow 
individual patients through the observational period, 
use of this device seems to be limited to observations 
of about 15 persons at any one time. Often, however, 
there is double or triple that number in a 
ward day-room. 

The technique which is presented in this article 
was developed and used over a two-year period as 
part of a larger study of group psychotherapy with 
hospitalized schizophrenic patients. Its use re- 
quires regularly scheduled (e.g. weekly) five- 
minute observations by two co-observers over a 
period of time. Its purpose is to give a picture of 
trends in amount and type of patients’ activities and 
in extent of interpersonal communication on a ward 
as a unit. The data obtained on any ward may 
then show changes over time on that ward and 
may serve for comparing it with another ward or 
section in the hospital. As an illustration of the 
application of the technique, its use and resultant 
findings with some of the patients in the Group 
Psychotherapy Research Project are presented. 


DEVELOPMENT OF THE TECHNIQUE 


In the study of group psychotherapy with hos- 
pitalized schizophrenic patients, there were an ex- 
perimental and a control ward. One hypothesis 
was that there would be a carryover of certain types 
of social behavior from the therapy groups to the 
experimental patients living outside of the therapy 
groups. 

At least, there were questions about this. Would 
patients talk more to one another as a result of 
group therapy? Would they show a greater interest 
in each other? Would they band together more? 
Would they take a greater interest in outside activi- 
ties and surroundings, especially those which are 
culturally determined, such as reading, writing, 
listening to the radio? There was no source of data 
in existing hospital records which would yield this 
information. Procedures had to be devised through 
which such information could be obtained. 

It was first thought that evidences of a sociali- 
zation effect might best be observed in the activities 
of the patients in other organized groups. Patients 
were therefore observed in their physical recon- 
ditioning classes, on the ball field, and in occupa- 
tional therapy. In these situations roughly twenty 
patients from either the control or the experimental 
wards were observed at a time. These groups were 
of a size that permitted narrative recording and 
observations on individual patients similar to 
records made of group therapy meetings. The 
patients in these organized groups, however, were 
especially selected because of suitability for the 
given activity, and this introduced a bias in sampling. 
Moreover, in these early observations, differences 
in behavior between experimental and control patients 
seemed quite probably to be affected as much by the 
differences in activity and by the frequent changes 
in the instructor-in-charge as by the known variable 
of group psychotherapy. 

1 This article is in part drawn from a report on a 
research project in group psychotherapy directed by 
Florence Powdermaker, M.D., and Jerome D. Frank, 
M.D. A full report is to be published in book form. 
The research was conducted under the auspices of the 
Washington School of Psychiatry under contract with and 
sponsored by the Veterans Administration. This article 
is published with the approval of the Chief Medical 
Director. The statements and conclusions published by 
the authors are the result of their own study and do not 
necessarily reflect the opinion or policy of the Veterans 
Administration. 

To obtain a more representative sample of the 
patients under study and to minimize undesirable 
variations in the settings, observations were shifted 
to the relatively unstructured life in the day-rooms, 
where almost all the patients could be observed. 
The large numbers of patients seen simultaneously, 
however, their greater freedom of movement, and 
the absence of a prescribed focus such as activity 
or hospital staff leader presented special problems 
of observation and recording. Narrative accounts 
were not feasible because activities observed were 
often discontinuous, discrete, and simultaneous. 
Identification of individual patients in these large 
groups was not possible. A scheme for recording 
observations in other than narrative form had to be 
developed. It became apparent, moreover, that in 
the larger observational field at least two persons 
were needed to observe simultaneously. Their joint 
observations, after many trial runs, resulted in 
reliable, quantitative, and comparable descriptions 
of some of the behavior on the experimental and 
control wards. Two of the writers (H. S. M. and 
E. V.) developed the observation instrument and 
used it during the first year of research. Three 
hospital staff workers² served in pairs as observers 
the second year. The reliability varied between 90 
and 100 per cent agreement among the five ob- 
servers participating. The categories for observa- 
tion were formalized in the Ward Observation 
Record, illustrated below. 
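
The article reports agreement of 90 to 100 per cent but does not state how the figure was computed. One common way of expressing agreement between two observers' tallies is sketched here purely as an assumption: the formula (sum of the smaller tally in each category divided by the sum of the larger), the function name, the category labels, and the counts are all hypothetical.

```python
def percent_agreement(tallies_a, tallies_b):
    """Agreement between two observers for one observation period:
    100 * (sum of the smaller tally per category) / (sum of the larger).
    This formula is an assumed convention, not the one in the article."""
    categories = set(tallies_a) | set(tallies_b)
    smaller = sum(min(tallies_a.get(c, 0), tallies_b.get(c, 0)) for c in categories)
    larger = sum(max(tallies_a.get(c, 0), tallies_b.get(c, 0)) for c in categories)
    # Two empty records agree completely.
    return 100.0 if larger == 0 else 100.0 * smaller / larger


# Invented tallies for a single five-minute observation.
observer_1 = {"motor": 4, "reading": 6, "verbal": 3, "non_verbal": 2}
observer_2 = {"motor": 4, "reading": 5, "verbal": 3, "non_verbal": 2}

agreement = percent_agreement(observer_1, observer_2)
print(round(agreement, 1))
```

Under this convention a single disputed tally among fifteen yields agreement of roughly 93 per cent, which falls within the range the authors report.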


USE OF THE TECHNIQUE 


During the first year, observations were made at 
intervals in all four day-rooms, each for two five- 
minute periods in a given day. The ward sections 
(open experimental, open control, locked experi- 
mental, and locked control) were visited in varying 
sequence, once in the morning and once in the 
afternoon. During the second year the sections 
were observed by different pairs of observers for 
one five-minute period each, on a given day, alter- 
nating mornings and afternoons in successive 
weeks.³ 


2 Marcia E. Kensinger and John A. Powers, social case 
workers, and Roland A. Fitzpatrick, clinical psychologist, 
on the staff at the hospital, gave their time as observers 
because of their interest in the research. 

3 When the observers could not follow the observation 
schedule, however, observations had to be omitted for 
that week. 

To start with, a large number of categories of 
observation was tried. Those that were found to 
lack sufficient precision were dropped. On the 
final form, tallies were entered on observations of 
(1) motor activities, (2) observable perceptual 
responses to impersonal stimuli, (3) verbal evidences 
of interpersonal relationships, and (4) non-verbal 
evidences of interpersonal relationships. Tallying 
was done without regard for length or repetition 
of an activity, the observational unit being the 
patient(s)-activity. 

When the observers entered the day-rooms, they
made no overtures to any of the patients. They sat
in chairs against the wall and responded briefly to
any overtures made to them. The first entry on
the record during the five-minute observation was
the total number of patients in the day-room at the
time of the observers’ entry. If additional patients
entered during the five minutes, they were added
to “Total number,” but those leaving the day-room
were not subtracted from the total. For “Climate”
a two or three word description of the feelings or
mood in the day-room was entered; such phrases
as “sleepy” or “hyperactive” sufficed. Tallying
of patient(s)-activities was done under the appro-
priate categories on the record sheet.

Under “motor,” a patient’s goal-directed walking
was tallied once as such without regard for the
distance he moved or any repetition of this activity
during the observation period. A second patient’s
walking was given a second tally in the same cate-
gory. The category “pacing” was distinguished
from “walking” in that the former was not per-
ceptibly directed toward any external goal. Under
“other” (motor activities) a patient’s sweeping the
day-room, waxing the floor, washing the windows,
climbing over chairs, or engaging in similar be-
havior was scored, but movements like autistic
mannerisms, changes in catatonic posturing, stand-
ing up, or sitting down were not scored.

Under the category of “external stimuli,” such
activities as reading, writing, working a puzzle,
playing solitaire, or listening to the radio were
scored. Tallies for reading, the activity which
occurred most frequently, were kept separate from
the others. In all other cases where activities of
this sort were scored, a patient had to give a clear
indication of his activity, as for example, changing
the station or whistling with the music when listen-
ing to the radio.


WARD OBSERVATION RECORD

[Form headings: Climate; External Stimuli (Reading; Writing & Other); Relating; Descriptions]


Under “relating,” one tally was given for an 
observed verbal or non-verbal communication to 
another person. Criteria for non-verbal relating 
were based on definite activity observed showing 
interest in another person, that is, purposeful move- 
ment toward another for a cigarette light, or catch- 
ing the eye or attention of the observer and 
gesturing meaningfully. The casual glance which 
was observed but where no other attempt at com- 
munication was made was not scored. Two patients 
playing cards with each other received one tally 
as a patient relation under either the verbal or 
non-verbal category according to whether they 
spoke to each other. Three or more patients talk-
ing in a group or silently working on a jig-saw
puzzle received one score under “cluster” for either 
verbal or non-verbal relating, respectively. A 
patient’s verbal or non-verbal communications with 
an attendant, nurse, or other hospital personnel 
were tallied under “staff.” Verbal or non-verbal 
communications with both the observers (b) or
with either the female (f) or the male (m) ob- 
server were tallied in the appropriate places. 
Activities such as whistling, laughing, and singing 
were not scored except where externally determined. 
For example, laughing was not scored when it 
apparently was in response to the patient’s own 
hallucinations but was scored if it seemed to be in 
response to some activity on the ward. 
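
The counting rules described above can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' instrument: the class `WardObservation`, its method names, and the patient labels are all hypothetical (the paper notes that individual patients could not be identified, so the labels stand only for "a distinct patient seen during this period"). The sketch also converts tallies to percentages of the total number of patients, which is the conversion the paper uses for its ward comparisons.

```python
from dataclasses import dataclass, field


@dataclass
class WardObservation:
    """One five-minute observation period in a day-room (hypothetical model)."""

    present_at_entry: int
    entered_later: int = 0
    tallies: dict = field(default_factory=dict)
    _seen: set = field(default_factory=set, repr=False)

    def patient_entered(self, count=1):
        # Patients entering during the period are added to "Total number".
        self.entered_later += count

    def patient_left(self, count=1):
        # Patients leaving are not subtracted from the total, so this is a no-op.
        return None

    @property
    def total_number(self):
        return self.present_at_entry + self.entered_later

    def tally(self, patients, category):
        # The unit is the patient(s)-activity: one tally per distinct
        # patient or group per category, ignoring length and repetition.
        key = (frozenset(patients), category)
        if key in self._seen:
            return
        self._seen.add(key)
        self.tallies[category] = self.tallies.get(category, 0) + 1

    def percentages(self):
        # Tallied patient(s)-activities divided by total number of patients.
        if self.total_number == 0:
            return {category: 0.0 for category in self.tallies}
        return {
            category: 100.0 * count / self.total_number
            for category, count in self.tallies.items()
        }


if __name__ == "__main__":
    period = WardObservation(present_at_entry=19)
    period.patient_entered()            # total becomes 20
    period.patient_left(2)              # total stays 20
    period.tally({"A"}, "walking")
    period.tally({"A"}, "walking")      # repetition, not tallied again
    period.tally({"B"}, "walking")      # second patient, second tally
    period.tally({"C", "D"}, "non-verbal relating")  # a pair gets one tally
    print(period.total_number, period.percentages())
```

Keeping departures out of the total means the percentages are computed against everyone who was present at any point in the period, which is how the text describes the "Total number" entry.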
Comparable quantitative representations were
obtained by translating the raw scores for each
category into percentages by dividing the tallied
patient(s)-activities by the total number of patients
in the day-room during the five-minute observation
period. Comparisons of the wards revealed by this
method are presented in Figure 1 and Tables 1
and 2. Table 1 presents a comparison of observed
differences between the first and second years.
Figure 1 presents a comparison of the wards in
regard to the category which most distinguished
the wards during the second year. Table 2 is a
detailed breakdown during the second year of the
four major categories in which the wards were
compared. A similar breakdown was not made
for the first year observations because of the smaller
number of observation periods (N = 14). Since
experimental patients had received more therapy
by the second year, there was more interest in the
differences revealed at that time.


TABLE 1

SIGNIFICANCE RATIOS OBTAINED FROM DATA ON WARD OBSERVATIONS *

CATEGORY
Motor Activity
Responses to External Stimuli
Verbal Relating
Non-verbal Relating

CLOSED WARDS
1ST YEAR

OPEN WARDS
1ST YEAR    2ND YEAR

.06 E/C
.94 E/C
.81 E/C
1.60 E/C

1.94 E/C
.81 E/C
.44 C/E
2.39 E/C

.64 E/C
.59 E/C
1.16 E/C
1.17 E/C

* First year: t required for significance, 5 per cent level, 2.16; 1 per cent level, 3.01.
Second year: t required for significance, 5 per cent level, 2.06; 1 per cent level, 2.79.


TABLE 2

SIGNIFICANCE RATIOS FOR OBSERVATIONS DURING THE SECOND YEAR *

CATEGORY                                  CLOSED WARDS    OPEN WARDS

Motor Activity (Total)                      .23 E/C        .64 E/C
  Directed Walking                          .25 E/C        .90 E/C
  Pacing                                    .37 C/E        .11 C/E
  Other                                         C/E        .06 C/E

Responses to External Stimuli (Total)           E/C        .59 E/C
  Reading                                       E/C        .64 E/C
  Other                                         E/C        .43 C/E

Verbal Relating (Total) †                   .95 E/C       1.16 E/C
  To Self                                   .30 C/E        .16 E/C
  To Staff                                      E/C        .63 E/C
  To Female Observer                            E/C        .82 E/C
  To Male Observer                              E/C        .69 C/E

Non-verbal Relating (Total) †                   E/C       1.17 E/C

* N = 25; t required for significance at 5 per cent level, 2.06;
t required for significance at 1 per cent level, 2.79.
† t values were not computed for all the divisions of these categories
because the differences were so small they were obviously not
significant.


In the observations of the second year, a bias was
introduced which cannot be evaluated and which
may seriously prejudice the data. This bias was
the introduction to the experimental ward of nine
patients 4 who were not randomly selected but who
had been hospitalized a shorter period of time and
had not had any shock therapy. These patients
were counted with the others because of the prac-
tical problems involved in any effort to exclude
them from these observations.


4 Since each ward beds 82 patients, the bias involves
about one-ninth of the experimental ward.

RESULTS AND DISCUSSION


In Table 1, the only category which distinguishes
the two wards to any degree approaching statistical
significance during the second year is “verbal
relating” on the closed wards. The difference
between the wards, both closed and open, increased
from the first to the second year in favor of the
experimental ward. Although no other general
category clearly differentiates the wards during
the second year, in every category experimental
patients exceed the control patients. This common
trend might well represent an effect of group
therapy.5
FIG. 1. PERCENTAGE OF PATIENTS’ VERBAL RELATING
IN THE FOUR WARD SECTIONS


Although the differences between the wards 
(Table 1) increased from the first to the second 
year in “verbal relating,” they decreased markedly 
in “motor activity” and less markedly in “non- 
verbal relating.” This might reflect one of the 
common aims held by all group therapists—to 
encourage patients to verbalize rather than act out 
their feelings. Experimental patients on the closed 
ward manifested a greater increase in “responses 
to external stimuli” from the first to the second 
year, but the difference decreased slightly on the 
open wards although still favoring the experimental 
patients. This might indicate that some patients 
do show greater interest in outside affairs as a result 
of group therapy. 


5 Since this trend was already manifest in the first 
year, it is possible that the imbalance of controls during 
the second, caused by the substitution on the experi- 
mental ward of nine less chronic patients, was perhaps 
not very great. Nevertheless the data should be inter- 
preted with caution and accepted with reservation. 
The breakdown in Table 2 of the general cate-
gories during the second year reveals the following
points of interest. 

First, in regard to “motor activity,” experimental 
patients on both closed and open wards walked 
purposefully more often than did control patients, 
while control patients paced aimlessly more fre- 
quently. Control patients on the closed wards, 
however, engaged in “other motor activities” more 
often. Since the bulk of the activities in this last 
sub-category concerned menial tasks on the ward, 
this might confirm an inference made on the basis 
of other data, that control patients were “better” 
(more docile and obedient) hospital patients. In 
no instance, however, do the differences reach sig- 
nificance at the 5 per cent level. 
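
The comparison against the significance thresholds can be sketched as below. The critical values 2.06 and 2.79 are those given in the Table 2 footnote for N = 25; the function name `significance_level` is hypothetical, and the sample ratios are only what the directed-walking and pacing entries for the closed wards appear to read in the scanned table.

```python
# Critical t values for the second-year observations (N = 25),
# as stated in the Table 2 footnote.
CRITICAL_T = {"5 per cent": 2.06, "1 per cent": 2.79}


def significance_level(t_ratio):
    """Return the strictest level a ratio reaches, or None if it reaches neither."""
    magnitude = abs(t_ratio)
    if magnitude >= CRITICAL_T["1 per cent"]:
        return "1 per cent"
    if magnitude >= CRITICAL_T["5 per cent"]:
        return "5 per cent"
    return None


if __name__ == "__main__":
    # Ratios as they appear to read in Table 2 (closed wards); treat as illustrative.
    for label, ratio in [("Directed Walking", 0.25), ("Pacing", 0.37)]:
        print(label, significance_level(ratio))
```

The direction of a difference (E/C or C/E) is recorded separately in the tables, so the check uses only the magnitude of the ratio.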

In regard to “responses to external stimuli,” 
although experimental patients seemed to do more 
reading, control patients on the open ward paid 
more attention to other stimuli. Interpretation of 
these differences is questionable. 

Under “verbal relating,” on the closed wards, 
experimental patients talked more often to other 
people while control patients talked slightly more 
often to themselves. On the open wards, control 
patients talked more to the male observers but 
experimental patients were more verbal in other 
respects. Other than the fact that experimental 
patients tended to engage in “non-verbal relating” 
more frequently, no point of special interest was 
indicated by a breakdown of this category. 


SUMMARY 


In this use of the Ward Observation Record, 
group psychotherapy seemed to have a generally 
stimulating effect on patients in regard to more 
outwardly purposeful activities and verbal relation- 
ships with other people. While the findings in this 
study are not conclusive, the use of this technique, 
together with other methods for the study of the 
processes and effects of a new treatment or training 
procedure, seems to offer hospital administrators an 
objective basis for evaluating and deciding upon 
the extension of their programs. 
DIRECTED THINKING. By George Humphrey. New
York: Dodd, Mead and Co., 1948. Pp. 229.


It is as pleasant as it is unusual to find a sound 
scholar who can clothe his wisdom with wit and 
charm. Such a scholar is George Humphrey, for 
many years professor of psychology and philosophy 
at Queens University and now the first occupant 
of the recently established chair of psychology at 
Oxford. Unlike so many popular writers on the 
art of thinking, Humphrey has at no time sacrificed 
his scientific integrity. The problems he discusses 
are the problems with which psychologists have 
always been concerned. He attacks them in straight- 
forward fashion, skillfully combining well-docu- 
mented evidence from the laboratory with illustra- 
tions from everyday life, but never evading a prob- 
lem because of its difficulty or pretending that it 
has been solved when such is not the case. Humph-
rey’s book is frankly a book for the layman rather
than for the psychologist. One hazards the guess, 
however, that most psychologists as they read it will 
realize to their surprise that they are learning some- 
thing new about the psychology of thinking, and 
having read it they will recommend it to their 
students and their colleagues. 

ROBERT B. MACLEOD

Cornell University 


THE INDIVIDUAL AND HIS RELIGION: A PSYCHOLOGICAL
INTERPRETATION. By Gordon W. Allport. New
York: Macmillan, 1950. Pp. x+142. $2.50.


Prof. Allport’s psychological portrayal of the re-
ligious sentiment in personality-structure is a much-
needed work. Its reading by any conscientious
inquirer should provide a salutary antidote to over- 
secularized college curricula, to calcified ecclesias- 
ticism, to spurious amalgams of psychiatry and 
religion, and to magisterial positivism, psycho- 
analysis, dialectic materialism, and naive naturalism, 
and to those social theories wherein the individual 
is a passive recipient of acculturation. 

Before our dialecticians and “pure” scientists
cursorily label this book as just another lamented
addition to contemporary “failure of nerve,” they had
better take heed. Prof. Allport has lost none of his
characteristic talent for bringing to bear upon a
psychological problem both the scrupulosity of the
scientist and the sensitivity of the artist. Moreover,
his critics will soon find the ground cut from under
them, because he has chosen to deal primarily not
with rudimentary, or neurotic, or psychotic but with
mature manifestations of the religious sentiment.
The author, in other words, has gone to the most 
reputable authorities in the field, those productive 
personalities throughout history who have them- 
selves undergone transforming experiences which 
have theoretically and practically integrated their 
lives “without remainder.” 

The customary scientific approach to this subject 
usually commits both the genetic and the “natural- 
istic” fallacies of explaining end-products in terms 
of their origins, as though the latter could ever fully
exhaust the quality, direction and meaning of the 
former. Religion, we are monotonously reminded, 
is mere infantile reflexology, or visceral impulses, 
or childhood wishes, fixations, projections, etc. A 
comparable kind of enlightenment would result 
were a theologian to explain science in terms of 
primitive magic, or incipient curiosity, or a pre- 
genital compulsion for orderliness. A more sophis- 
ticated but equally pernicious fallacy consists in ad- 
mitting more than one characteristic feature of the 
religious orientation, but accepting at the same time 
the analytical validity of reducing the whole to its 
component parts. This is an ancient confusion 
which mistakes the distributive for the collective, 
enumeration for synthesis. 

These illogicalities, among others, are unwitting 
testimony to a more ambitious scepticism which 
attempts to define religious experience in terms of 
pure subjectivism but which finds itself coerced 
to define all experience in similar terms. It is sheer 
prejudice to assert that since there are subjective 
causes for an experience, there can be no objective 
causes as well. Moreover, it is inexcusable naiveté 
to buttress such prejudice with an oversimplified
psychological theory which reduces percepts and 
concepts to sensations, and with a misapplied prin- 
ciple of parsimony which assumes that a psycho- 
logical cause renders all other causes superfluous. 
Wisdom does not lie in posing false polarities; both 
poles may be equally true or they may not be 
mutually exclusive. If the religious experience is to 
be condemned as illusory in its reference to an 
objective reality because such experience is a state of 
mind, then so may science and art be equally guilty 
of solipsism. Obviously all mental events occur in 
minds, but such tautology tells us nothing about the 
moral, religious, aesthetic, and scientific orders to 
which mental states refer, orders whose validity is 
substantiated by empirical as well as inferential evi- 
dence. Psychological theories to the contrary, the 
theistic hypothesis (immanent and/or transcendent) 
has never been seriously threatened, at least not by
the adduced evidence which I have examined. In 
the empirical argument for theism, God is, of course, 
no discursive hypothesis but an indubitable fact of 
immediate apprehension. Actually, the most alleg- 
edly damaging attacks which psychology and even 
abnormal psychology have been able to muster are 
not at all inconsistent with theism. Any objective 
order, for instance, no more necessitates an absolute 
agreement in the reports of its nature than it necessi- 
tates a universal awareness of its presence. As for 
the emotionally disturbed, the eccentric, the phys- 
ically or mentally tormented, these may still be 
capable of penetrating insights whether the victims 
be ordinary men or those of artistic or scientific abil- 
ity. Theism, like all hypotheses, is subject to critical 
analysis, but not from any atomistic psychologism. 

Prof. Allport, by stimulating contrast, develops 
the position that a man’s mature religion is the 
“audacious bid he makes to bind himself to creation 
and to the Creator. It is his ultimate attempt to 
enlarge and to complete his own personality by find- 
ing the supreme context in which he rightly 
belongs.” The psychological framework which such 
a position implies was developed some years ago in 
the author’s Personality, where he argued that the 
criteria of maturity are first, a variety of psycho- 
genic interests in ideals and values, second, an abil- 
ity to objectify oneself by means of reflection and 
insight, and third, a unifying philosophy providing 
ethical direction and coherence. The religious senti- 
ment, despite its organic and egocentric origins, 
evolves into an autonomous, discriminative, affective- 
ideational-conative system, permitting a patterning 
or differentiation inclusive of the widest interests. 
It assumes, like personality itself, diverse forms de- 
pending upon the believer’s culture, temperament, 
capacity, values, intentions, and the logical and 
meaningful methods of justifying his faith and lim- 
ited certitude. Contrary to such slick, monistic 
causation as escapism, wishfulness, anticipation, or 
anthropomorphism, the religious quest is never com- 
pletely accomplished, is solitary, arduous and ulti- 
mately tragic. As Prof. Trueblood reminds us, the 
nature of God, to judge from religious experience, 
has been different from what men originally desired, 
different in many cases from what they expected, 
and “wholly Other” than themselves. No one can 
provide the believer with any neatly-packaged faith, 
nor can he, in Prof. Allport’s words, “prescribe for 
him his pact with the cosmos.” Furthest removed 
from dogmatism, the religious sentiment is char- 
acterized by an “heuristic” quality which enables it 
constantly to search and discover both within man 
and in the universe all the highest values consonant 
with that sentiment’s most comprehensive purposes. 
It is this very quality, I venture, and not an a priori 
consideration of some panlogism, that lends justi-
fication to many mature religionists for subscribing
to a coherence theory of truth, a theory which 
stresses the value of cumulative evidence. 

It should come as no surprise, therefore, that 
religion, which deals with the most inclusive relation- 
ships should, as the author points out, be the “most 
controversial, the most doubt-ridden, the most 
elusive” of mental activities. Doubt always plagues 
those aspects of all investigations which call for 
interpretation; this holds true for natural as well as 
for social science. The unsophisticated alone speak 
of either scientific or religious certitude, but only 
deductive or mathematical logic can by its very 
assumptions vouchsafe such assurance. Incidentally, 
beyond and underlying the “certitudes” of science 
are the still problematic concepts of natural law, 
determinism, uniformity, induction, etc., and the 
non-scientific reality of human purpose, creativeness 
and personality. Prof. Allport marshals for rapid 
inspection and cogently refutes the more common 
forms of doubt, ranging from those primarily reac- 
tive, those associated with violated self-interest, with 
institutional malpractices, up to those connected 
with group projection and rationalization. His 
lengthiest consideration, of course, concerns scien- 
tific doubting. 

The scientist has his own frame of professional
orientation: a routinized microscopism, the use of 
closely relevant hypotheses, objectivity, etc., which 
finds the religious sentiment alien and at times dis- 
tasteful. It is important to stress, however, that this 
attitude toward religion need not follow logically 
from the orientation. A detailed preoccupation 
with a given subject-matter cannot deny cosmic 
hypotheses; actually, it presupposes them. The limi- 
tations of science are imposed by the necessities of 
its methodology; if the scientist refuses to concern 
himself with the metaphysical foundations upon 
which his own edifice rests (rationality, order, rela- 
tionship, etc.), or if he fails to see that his skeletal 
version of the world, his abstractive ideal of mathe- 
matical structure and closed terminological system, 
is completely inadequate to explain the values men 
cherish most deeply (aesthetics, philosophy, love, 
friendship, religion), he has not yet learned that 
knowledge cannot be limited to science. There are 
different areas of knowledge with their correspond- 
ing methods and their degrees of meaning for the 
complete personality. It is the essence of provincial- 
ism to argue from one’s lack of experience of an 
object to a denial of the object, particularly if one 
has not met the required conditions of the experi- 
ence. 

Consistent with his recurrent emphasis upon the 
individual differences which characterize the opera- 
tion of the religious sentiment, Prof. Allport con- 
cludes with the various psychological means, recip- 
rocally supportive, by which the believer validates 
his faith. There is, first, the warranted confidence
that God, like other objects, is inherent in man’s
intentions. Goals are resident in the striving,
especially those that are religious, which must have 
been implanted by the Creator and Conserver of all 
values. Thus, belief is both a result of striving and 
a reasonable inference from that very striving. The 
concept of God here becomes, in Montague’s mem- 
orable phrase, a “momentous possibility that what is 
highest in spirit is also deepest in nature.” There 
are, second, the old a priori and the empirical ver- 
sions of the ontological argument. Third, there are 
the cosmological and teleological approaches to sat- 
isfy those of rational temper who, however, initially 
may also be moved by the awe-inspiring structure 
of the physical and moral universe and by the im- 
mediate experience, functional or noetic, of God’s 
presence in prayer, worship, ritual, or unsolicited 
grace. The classical metaphysical arguments, along 
with customary sources of knowledge such as sense 
perception, reason, insight and the experiences of 
others, supplement what James called the believer's 
private “transaction,” which need not necessarily be 
a bizarre or esoteric mysticism. “Transaction” could 
thus be framed to include, as the author does, re- 
vealed religion, although this form of validation has 
traditionally referred also to sacred symbols, writ- 
ings, etc. The final validation is the pragmatic, 
made famous by James’ “will-to-believe,” which is a 
productive option in that its faith generates values 
conducive to a unified life. 

My major criticism of the book stems from its 
exiguous proportions. During the course of his re- 
corded indebtedness in the preface, Prof. Allport 
acknowledges, among other things, the help of those 
who tried to minimize his psycho- and ethnocen- 
trism. And then he adds with characteristic modesty 
that his critical advisers failed to correct fully his 
failings, because a psychologist is an intruder in the 
vast areas of religion, theology and philosophy. 
There is enough in this book alone to justify 
anyone’s impressions that its author is by no means 
an alien in these areas and that had he chosen, he 
could have elaborated some of his views and added 
new ones which would have strengthened his posi- 
tion, especially with relation to the secularized 
reader, who would most naturally be interested 
in a historico-scientific form of theistic validation. 
There is, for instance, the somewhat neglected story 
of intellectual usurpation on the part of contem- 
porary naturalism, which many assume to be the only 
philosophy logically consistent with science and 
which has for the past thirty years or so gradually 
appropriated unto itself ethical, aesthetic, and other 
psychogenic values completely inconsistent with its 
original foundations. Contrary to much of accepted
history, it was not only the religionists who initially 
attacked the inadequacies of nineteenth-century con- 
cepts of natural law, reductionism, mechanical deter- 
minism, natural selection, materialistic monism, etc. 
—concepts which could provide limited explanations 
for macroscopic movement, for structure, for gross 
survival, but not for any viable axiology. It was the 
scientists themselves, as well as philosophers and 
metaphysicians, who were responsible for such de- 
velopments as vitalism, dialectic materialism, emer- 
gent and creative evolutions, critical realism and 
other pluralistic reformulations of the original 
naturalistic creed, realizing the necessity for explain- 
ing the sciences themselves, let alone the arts, in 
terms logically associated in the first place with 
theism, e.g., order, reason, purpose, novelty, free- 
dom, etc. Thus, had the author fully exploited this 
form of validation he could also have done more 
justice to his discussion of the complex relationship 
between personal religion and morality, or to the 
problem of evil, which is treated rather sketchily. 

There is an unmistakable weighting throughout 
the book in the direction of psychocentrism; this 
weighting was not inevitable as Prof. Allport seems 
to think, and it seems unfortunately to minimize in 
the “transaction” the initiating power of that Ante- 
cedent and Superhuman Reality, that sustaining, 
compelling and redeeming “Other” which the re-
ligionist comes to believe is the only source and end
of man’s being. 

The reader will be richly rewarded not only by 
the author’s analysis of the central theme but by 
his comments on supplementary topics which can 
only be mentioned here: the religion of youth, the 
psychology of religious intention, referential doubting 
or the apparent conflicts of scientific and religious 
discourse, and the relationship between psycho- 
therapy and religion, to which Prof. Allport devotes 
a chapter rich in explication and suggestion. In the 
course of his discussion concerning the complexity 
and variety of the religious sentiment, he anticipates 
the uncongenial reception which his book will prob- 
ably elicit from some scientists, historians, sociolo- 
gists and churchmen. He rightly maintains that dis- 
interested scrutiny of his position will ultimately ben- 
efit psychology, social science and theology. Should 
his hopes prove illusory, Prof. Allport might find 
bracing respite in Samuel Johnson’s challenging, “I 
have found you an argument, but I am not obliged 
to find you an understanding.” He will, however, 
most likely continue, following Plato, to remind men 
of what is implicit in their lives. 

GEORGE KIMMELMAN

Philadelphia, Pa. 